ABSTRACT

The SDSS-III Baryon Oscillation Spectroscopic Survey (BOSS), a five-year spectroscopic survey of
10,000 deg^2, achieved first light in late 2009. One of the key goals of BOSS is to measure the signature
of baryon acoustic oscillations (BAO) in the distribution of Lyα absorption from the spectra of a sample of
~150,000 z > 2.2 quasars. Along with measuring the angular diameter distance at z ~ 2.5, BOSS
will provide the first direct measurement of the expansion rate of the Universe at z > 2. One of
the biggest challenges in achieving this goal is an efficient target selection algorithm for quasars in
the redshift range 2.2 < z < 3.5, where their colors tend to overlap those of the far more numerous
stars. During the first year of the BOSS survey, quasar target selection methods were developed
and tested to meet the requirement of delivering at least 15 quasars deg^-2 in this redshift range,
with a goal of 20, out of 40 targets deg^-2 allocated to the quasar survey. To achieve these surface
densities, the magnitude limit of the quasar targets was set at g < 22.0 or r < 21.85. While detection
of the BAO signature in the distribution of Lyα absorption in quasar spectra does not require a
uniform target selection algorithm, many other astrophysical studies do. We have therefore defined a
uniformly-selected subsample of 20 targets deg^-2, for which the selection efficiency is just over 50%
(~10 z > 2.20 quasars deg^-2). This "CORE" subsample will be fixed for Years Two through Five
of the survey. For the remaining 20 targets deg^-2, we will continue to develop improved selection
techniques, including the use of additional data sets beyond the SDSS imaging data. In this paper we
describe the evolution and implementation of the BOSS quasar target selection algorithms during the
first two years of BOSS operations (through July 2011), in support of the science investigations based
on these data, and we analyze the spectra obtained during the first year. During this year, 11,263
new z > 2.20 quasars were spectroscopically confirmed by BOSS, roughly double the number of
previously known quasars with z > 2.20. Our current algorithms select an average of 15 z > 2.20
quasars deg^-2 from 40 targets deg^-2 using single-epoch SDSS imaging. Multi-epoch optical data and
data at other wavelengths can further improve the efficiency and completeness of BOSS quasar target
selection.

Subject headings: surveys - quasars: Lyman-α forest, cosmology: classification techniques
1. INTRODUCTION

1.1. The Baryon Oscillation Spectroscopic Survey 

The current Cosmic Microwave Background (CMB) data are in excellent agreement with the
theoretical predictions of a flat cosmological model with cold dark matter which is dominated
by dark energy with an equation of state parameter w = -1 (ΛCDM; Komatsu et al. 2011;
Larson et al. 2011). Acoustic peaks in the CMB anisotropy power spectrum are generated by
cosmological perturbations exciting sound waves in the relativistic plasma of the early
universe (Sunyaev & Zeldovich 1970; Peebles & Yu 1970; Bond & Efstathiou 1984, 1987;
Holtzman 1989; Meiksin et al. 1999). The scale of these peaks, which is set by the sound
horizon at last scattering (Eisenstein & Hu 1998; Blake & Glazebrook 2003; Seo & Eisenstein
2003), can be used as a cosmological standard ruler. These baryon acoustic oscillations (BAO)
are present in the distribution of matter at late times as well, and were first measured in
the large-scale distribution of galaxies by Eisenstein et al. (2005) and Cole et al. (2005).

BAO should also be present in the distribution of neutral hydrogen gas in the intergalactic
medium, and thus should be observable in the Lyman-α forest (LyαF) absorption spectra of
distant quasars (White 2003; McDonald & Eisenstein 2007; Slosar et al. 2009; Norman et al.
2009; Barenboim et al. 2010; White et al. 2010; McQuinn & White 2011). Measurements of BAO
in the LyαF would provide the first measurements of cosmic expansion and the angular diameter
distance at redshift z > 2 (other than the CMB itself), a regime not constrained by current
data, thus giving important constraints on, and tests of, the standard cosmological model.

The Sloan Digital Sky Survey (York et al. 2000) is now in its third phase (SDSS-III;
Eisenstein et al. 2011), and is carrying out a combination of four interleaved surveys that
will continue until the summer of 2014. One of those surveys, the Baryon Oscillation
Spectroscopic Survey (BOSS^27), commenced operations in late 2009, and is using essentially
all the dark time for SDSS-III. The key goal of BOSS is to measure the absolute cosmic
distance scale and expansion rate to an accuracy of a few percent from the signature of BAO
in the distribution of galaxies and neutral hydrogen (Schlegel et al. 2007, 2009). This will
be achieved by measuring spectroscopic redshifts for ≈1.5 million luminous red galaxies and,
simultaneously, the LyαF towards ≈150,000 high-redshift quasars^28. Both samples aim to
constrain the equation
TABLE 1

Quantity/units                          Year One    Full Survey*
Area (deg^2)                            880         10,200
Target density in NGC (deg^-2)          80          ~50
Target density in SGC (deg^-2)          70          ≈40
Total number of targets / 1 x 10^3      133         ~440
Efficiency                              0.26        >0.40
Number of z > 2.2 quasars / 1 x 10^3    13.5        175

* Projection based on observations through April 2011 and DR9 target selection.


of state of dark energy by measuring the angular diameter distance, d_A, and the Hubble
parameter, H(z), at z = 0.3, 0.6 and ~2.5. In addition to the cosmology goals, the
unprecedented dataset of z ~ 2.5 quasars will enable tests of black hole growth, wind and
feedback models and provide insights into the links between galaxy formation, evolution and
luminous AGN activity. Using data from the original SDSS quasar survey will also allow
studies of spectroscopic variability. BOSS uses the same 2.5 m Sloan Foundation telescope
(Gunn et al. 2006) that was used in SDSS-I/II, but since BOSS will observe fainter targets,
the fiber-fed spectrographs have been significantly upgraded. These upgrades include: new
CCDs with improved blue and red response; 1000 fibers of 2" optical diameter instead of 640
fibers of 3"; higher throughput gratings over a spectral range of 3600-10000 Å at a
resolution of about 2000; and improved optics.

1.2. Quasar Target Selection in BOSS 

Quasars have colors distinct from those of the much more numerous stars in the five-color
photometry of the SDSS (Fan 1999). Unobscured quasars have very blue continua, without any
breaks redward of the Lyα emission line, and so can be distinguished from hot stars, which
have a strong Balmer break, in the u - g, g - r color-color diagram (Figure 1). In
particular, at z < 2.2, quasars have a UV excess (as measured by u - g) that distinguishes
them from most stars, and they lie well away from the stellar locus at most higher redshifts
(but see below). SDSS-I/II targeted quasars for spectroscopy (Richards et al. 2002) by
selecting point sources which lie far from the locus of stars in color-color space (and all
extended sources with a strong UV excess), as well as point sources with radio emission from
the Faint Images of the Radio Sky at Twenty cm (FIRST) survey (Becker et al. 1995). The
majority of the more than 100,000 quasars spectroscopically observed by SDSS (Schneider et
al. 2010) were targeted in this way.

The Lyα forest enters the sensitive range of the BOSS spectrographs at z > 2.2, and the
number density of quasars falls dramatically at z > 3 (Osmer 1982; Schmidt et al. 1995;
Richards et al. 2006), so BOSS quasar target selection is designed to focus on the range
2.2 < z < 3.5. However, at z ~ 2.7, SDSS quasar colors are very similar to those of A stars
and blue horizontal branch stars (Fan 1999); thus the optimal quasars for studying the Lyα
forest are the most difficult ones for BOSS to target. Indeed, the SDSS-I/II quasar target
selection algorithm deliberately sparse-sampled objects in the region of color space where
z = 2.7 quasars should lie (Richards et al. 2002), in an attempt to minimize the
contamination by
Fig. 1. Color-color diagrams of point sources drawn from 7 deg^2 (the BOSS spectrograph field of view) in the SDSS photometric
database. (Left) 2,400 objects with 18.0 < g < 19.0, and (Right) 7,000 objects with 21.0 < g < 22.0. Most of the objects shown are stars;
low-redshift (z < 2.2) quasars lie preferentially in the region u - g < 0.6, g - r > where very few stars are found. At z > 2.2, quasars
become systematically redder in u - g as the Lyα forest moves into the u-band and Lyα emission moves into g. At z ~ 2.7, quasars have
colors similar to those of blue horizontal branch (BHB) stars. The larger photometric errors at faint magnitudes broaden the stellar locus
considerably (especially in the u-band for the reddest stars, which gives rise to the spread at g - r ~ 1.5), illustrating the challenges involved
in selecting faint objects by their colors. Tracks for the quasar locus, as presented in Bovy et al. (2011b, in prep.), are also shown, with
the corresponding redshift given by the color-bar legend. Approximate surface densities are quoted, and stellar classifications are given as
a guide.



stars. 

The BOSS survey requirements are for spectroscopy of 15 or more z > 2.2 quasars deg^-2
(150,000 quasars over the BOSS footprint of 10,000 deg^2; Eisenstein et al. 2011). Combining
calculations from McDonald & Eisenstein (2007) and McQuinn & White (2011) with the
luminosity function given by Jiang et al. (2006), we find that targeting to a magnitude of
g < 22 with perfect completeness will provide a surface density of 20 z > 2.2 quasars
deg^-2. This magnitude limit is approaching the detection limit of SDSS photometry
(Abazajian et al. 2004), meaning that photometric errors will significantly broaden the
stellar locus (Figure 1) and star-galaxy separation will be a factor. Contamination at both
the bright and the faint end of the BOSS target range is mainly from metal-poor halo A and F
stars, faint lower redshift (z ~ 0.8) quasars, and compact galaxies. To put these
requirements into perspective, the final quasar catalog from the original SDSS-I/II quasar
survey (Schneider et al. 2010) contained 17,582 z > 2.2 objects over 9380 deg^2, while the
2dF-SDSS LRG And QSO (2SLAQ) survey (Croom et al. 2009), which observed to g < 21.85 and
concentrated on UV-excess objects, contained 1,110 such quasars selected over 192 deg^2. The
original 2dF QSO redshift survey (2QZ; Croom et al. 2004) focused on the redshift range
z < 2.
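
To make the comparison with earlier surveys concrete, the surface densities implied by the numbers quoted above can be worked out directly. The short Python sketch below uses only the counts and areas given in the text; the variable and function names are illustrative and not part of any survey software.

```python
# Surface density of z > 2.2 quasars implied by the numbers quoted in the text.
# Names below are illustrative only.
surveys = {
    "SDSS-I/II": (17_582, 9_380.0),  # (number of z > 2.2 quasars, area in deg^2)
    "2SLAQ": (1_110, 192.0),
}


def surface_density(n_quasars, area_deg2):
    """Return the mean number of quasars per square degree."""
    return n_quasars / area_deg2


densities = {name: surface_density(n, area) for name, (n, area) in surveys.items()}

BOSS_REQUIREMENT = 15.0  # z > 2.2 quasars deg^-2 required by BOSS
BOSS_AREA = 10_000.0     # BOSS footprint in deg^2
boss_total = BOSS_REQUIREMENT * BOSS_AREA

for name, rho in densities.items():
    ratio = BOSS_REQUIREMENT / rho
    print(f"{name}: {rho:.2f} deg^-2 (BOSS requirement is {ratio:.1f}x higher)")
print(f"BOSS requirement over the full footprint: {boss_total:,.0f} quasars")
```

The BOSS requirement therefore corresponds to a surface density several times higher than that reached by either earlier sample.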

These challenges required a new approach to quasar 
target selection. The first year of the BOSS survey 
("Year One"; 2009 September through July 2010) was 
devoted in part to refining our algorithms for selecting 
these objects. The resulting sample of quasars at z > 2.2 
is comparable in size to the SDSS high-redshift quasar
sample, and of course reaches much fainter magnitudes
with much higher surface density. Thus the new sample 
itself represents the best test of our selection algorithms, 
and we modified those algorithms multiple times through 
the year. Year One included roughly three months of 
commissioning of the upgraded BOSS spectrographs and 
instrument control software as well as a steady ramp-up 
to full efficiency operations, so it includes well under 20% 
of the anticipated final sample for the five-year BOSS survey. As of April 2011, BOSS is on
track to complete its intended 10,000 deg^2 of spectroscopic survey area assuming historical
weather patterns and continuation of the current observing efficiency.

Motivated by the first science investigations based on Year One data (e.g., Slosar et al.
2011), this paper presents the methods and performance of the quasar target selection during
this year. In what follows, "Year Two" will refer both to the spectroscopic observations
carried out during BOSS' second year, 2010 August to 2011 July, and the results of the
quasar target selection presented in this paper over the entire 10,000 deg^2 BOSS footprint;
the distinction should be clear from context. Data from spectroscopic observations in Years
One and Two will be included in SDSS Data Release Nine (DR9^29). The final SDSS-III quasar
target selection algorithm will appear in a separate paper.

Background quasars have no causal influence on structure in the LyαF at the BAO scale^30.
Hence the sample of quasars we use for LyαF cosmological studies may be quite heterogeneous,
with the only consequence that the window function of the survey will depend on the
distribution of the quasars for which we have spectra. Since the precision of the BAO
measurement improves rapidly with the surface density of quasars (at fixed spectroscopic
signal-to-noise ratio (S/N)), we have implemented a target selection scheme in BOSS that can
maximize the number of quasars found at z > 2.2 in any area of the sky, taking advantage of
any available information (e.g., auxiliary data). In Year One, we explored a variety of
methods, settling on our final target selection algorithms late in the year.

29 http://www.sdss3.org/surveys/

30 There may however be some measurement bias at the 0.1-1% level for the flux power
spectrum, optical depth and the flux probability distribution, due to gravitational lensing
effects (see, e.g., Loverde et al. 2010).

At the same time, in order to use the quasars themselves for statistical studies (such as
luminosity functions or clustering analyses), we must also produce a uniformly selected
sample, which we refer to hereafter as CORE (§ 3.1). However, we changed the definition of
the CORE sample several times over Year One, as we tested various algorithms. Therefore, our
fully uniform quasar sample will not include data from this first year of the survey.
However, statistical studies (luminosity functions, clustering, and so forth) can utilize
all five years of BOSS data by including moderate incompleteness corrections for Year One
selection relative to the final CORE algorithm (see § 6). We describe the evolution of our
algorithms in detail in this paper, concluding with a description of the method we finally
adopted. We give the target selection for both Years One and Two, and thus for the DR9, and
analyze our performance from spectra obtained in Year One. By the end of Year Two, quasar
target selection (QTS) had been performed over the whole 10,000 deg^2 BOSS footprint. Data
from Year One were gathered over 880 deg^2; see Table 1.

This paper is organized as follows. In § 2 we describe the SDSS photometry on which the
target selection algorithms are most heavily based. Section 3 describes our methods for
selecting quasars (Richards et al. 2009a; Yèche et al. 2010; Kirkpatrick et al. 2011; Bovy
et al. 2011). These four papers suggest different, but complementary, methods, and we have
used a union of these techniques in different combinations through the survey. In Section 4
we describe the implementation of these targeting methods through the first year. In
Section 5 we report on the global properties of the resulting sample, including high-z
quasar targeting efficiency, from the data gathered during the first year of the BOSS, and
we compare the effectiveness of the various methods. In Section 6 we discuss the production
of a statistical quasar sample. We conclude in Section 7 and suggest improvements to BOSS
quasar target selection for the remainder (Years Three, Four and Five) of the survey.
Appendix A tabulates the logical cuts used on the input imaging data. Appendix B gives more
detail about Year One target selection, while Appendix C describes a pre-BOSS pilot survey
using the MMT. Appendix D characterizes the redshift completeness of our spectroscopic data.

We assume a cosmological model throughout with Ω_b = 0.046, Ω_m = 0.228, Ω_Λ = 0.725
(Komatsu et al. 2011). All optical magnitudes are quoted in, and based upon, the SDSS
approximation to the AB zero-point system (Oke & Gunn 1983; Adelman-McCarthy et al. 2006),
while all near-infrared (NIR) magnitudes are based on the Vega system. Throughout the paper,
"magnitude" refers to SDSS Point Spread Function (PSF) magnitudes (Stoughton et al. 2002).

2. SDSS PHOTOMETRY

2.1. Imaging Data

BOSS uses the same imaging data as that of the original SDSS-I/II survey, with an extension
in the South Galactic Cap (SGC). These data were gathered using a dedicated 2.5 m wide-field
telescope (Gunn et al. 2006) to collect light for a camera with 30 2k x 2k CCDs (Gunn et al.
1998) over five broad bands, ugriz (Fukugita et al. 1996); this camera has imaged 14,555
unique deg^2 of the sky, including 7,500 deg^2 in the North Galactic Cap (NGC) and 3,100
deg^2 in the SGC (Aihara et al. 2011). The imaging data were taken on dark photometric
nights of good seeing (Hogg et al. 2001), and objects were detected and their properties
were measured (Lupton et al. 2001; Stoughton et al. 2002) and calibrated photometrically
(Smith et al. 2002; Ivezić et al. 2004; Tucker et al. 2006; Padmanabhan et al. 2008) and
astrometrically (Pier et al. 2003).

Padmanabhan et al. (2008) present an algorithm which uses overlaps between SDSS imaging
scans to photometrically calibrate the SDSS imaging data. BOSS target selection uses data
calibrated using this algorithm from the SDSS Data Release Eight (DR8) database (Sec. 3.3;
Aihara et al. 2011). The 2.5°-wide stripe along the celestial equator in the Southern
Galactic Cap, commonly referred to as "Stripe 82", was imaged multiple times, with up to 80
epochs at each point along the stripe spanning a 10-year baseline (Abazajian et al. 2009).
In Section 4 we will discuss how the commissioning phase of BOSS used coadded catalogs in
SDSS Stripe 82, generated by averaging the photometric measurements from ~20 individual
repeat scans; the details are discussed in Appendix A.6 and in Kirkpatrick et al. (2011).

Roughly 50% of the SDSS footprint has been imaged more than once (Aihara et al. 2011);
combining the photometric measurements in these overlap regions reduces the flux errors.

Using the imaging data, BOSS quasar target candidates are selected for spectroscopic
observation based on their PSF fluxes and colors in SDSS bands. Fluxes that are used for
quasar target selection are corrected for Galactic dust extinction according to the maps of
Schlegel et al. (1998). All objects that are classified as point-like (OBJC_TYPE = 6) and
are brighter than g = 22 or r = 21.85 are passed to the various quasar target selection
algorithms. The joint magnitude limit was imposed because the LyαF moves into the g-band at
z ≈ 2.3, suppressing the g-band flux at redshifts greater than this. In practice, almost all
our targets satisfy both these conditions. Throughout this paper, magnitudes use the asinh
scale at low flux levels, as described by Lupton et al. (1999).

2.2. Photometric Pipeline Flag and Logic Cuts

During processing of the imaging data by the SDSS photometric pipeline, a number of
photometric flags are set for each detected object (Stoughton et al. 2002). These are
generated by the SDSS photometric pipeline
Fig. 2. Flowchart for the BOSS quasar target selection, as implemented from the beginning of the second year of BOSS observations.
The various broad categories of targets, including CORE, BONUS, KNOWN objects, and those detected by the FIRST survey, are indicated,
and are described in detail in Section 3. SUPPZ refers to a small number of lower-redshift objects targeted to study the effects of metal line
absorption (§ 4.6). The flowchart for the first year of BOSS target selection is given in Appendix B. The CORE sample is fixed for DR9
and the remainder of the BOSS. Objects which satisfy the XDQSO probability cut of P(XDQSO) > 0.424 are selected as CORE, and the
QSO_CORE_MAIN target flag bit is set. CORE selection is based on single-epoch SDSS photometry, but other selections use multi-epoch
photometry where it is available (e.g., in regions where SDSS imaging stripes overlap).
(Lupton et al. 2001), the Resolve algorithm (Aihara et al. 2011), and by photometric
calibration (Padmanabhan et al. 2008). Some of these flags indicate problems with the
deblending of close pairs of objects. Other flags are set due to poor or unreliable
photometry, e.g., if an object was saturated due to a bright star's diffraction spike or an
object was too close to the edge of a frame. If these flags are ignored, they can lead to
artifacts in the imaging data being selected as quasar targets. Details of these flags are
given in Stoughton et al. (2002) and have been updated in DR8 (Aihara et al. 2011).




There are four distinct sets of quasars targeted by
BOSS: targets selected by a uniform method, targets 
selected in a non-uniform way, matches to previously 
known z > 2.2 quasars, and matches to objects in the 
FIRST survey. We refer to these subsets of targets in 
this paper as CORE, BONUS, KNOWN and FIRST, re- 
spectively. 

Each of our targeting algorithms has different imaging 
flag cuts, as well as different flux limits imposed. We re- 
fer to these criteria collectively as "logic cuts." All such 
cuts are applied using single-epoch data with one excep- 
tion: color cuts made on FIRST targets use coadded, 
multi-epoch data wherever these are available. FIRST 
objects are thus not considered to be part of the CORE 
statistical sample, unless they independently meet the 
CORE selection criteria. The logic cuts are described in
detail in Appendix A.
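
The kind of pre-selection described in Sections 2.1 and 2.2 (point-like morphology, the joint g or r magnitude limit, and rejection of objects with problematic photometric flags) can be sketched as a boolean mask. This is only an illustrative sketch: the flag bit values, the function name and the example inputs are hypothetical, the point-source type code is assumed, and the actual per-method cuts are those tabulated in Appendix A.

```python
import numpy as np

# Hypothetical flag bits, for illustration only; the real bit assignments are
# defined by the SDSS photometric pipeline and are not reproduced here.
FLAG_SATURATED = 1 << 0
FLAG_EDGE = 1 << 1
BAD_FLAGS = FLAG_SATURATED | FLAG_EDGE

POINT_LIKE = 6                  # assumed OBJC_TYPE code for point-like sources
G_LIMIT, R_LIMIT = 22.0, 21.85  # magnitude limits quoted in Section 2.1


def passes_logic_cuts(objc_type, g_mag, r_mag, flags):
    """Boolean mask of objects passed on to the target selection algorithms.

    Magnitudes are assumed to be extinction-corrected PSF magnitudes.
    """
    objc_type = np.asarray(objc_type)
    g_mag = np.asarray(g_mag, dtype=float)
    r_mag = np.asarray(r_mag, dtype=float)
    flags = np.asarray(flags, dtype=np.int64)

    is_point_like = objc_type == POINT_LIKE
    bright_enough = (g_mag < G_LIMIT) | (r_mag < R_LIMIT)  # joint g OR r limit
    clean_photometry = (flags & BAD_FLAGS) == 0
    return is_point_like & bright_enough & clean_photometry


# Four made-up objects: the third is extended, the fourth has a bad flag set.
mask = passes_logic_cuts(
    objc_type=[6, 6, 3, 6],
    g_mag=[21.5, 22.3, 20.0, 21.0],
    r_mag=[21.9, 21.7, 19.5, 20.5],
    flags=[0, 0, 0, FLAG_EDGE],
)
print(mask)
```

The second object illustrates the joint limit: it fails g < 22 but is retained because r < 21.85.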

3. METHODS FOR BOSS QUASAR TARGET 
SELECTION 

3.1. Philosophy of CORE and BONUS 

The methods, data and logic flag cuts for BOSS Quasar Target Selection (QTS) are summarized
in Fig. 2. During Year One, we carried out QTS and designed spectroscopic plates on areas of
~100-300 deg^2 at a time. We refer to these areas, within which all the algorithms used in
QTS are uniform, as "chunks". Once QTS was more settled in Year Two, the areas of chunks
could be, and sometimes were, more than 1000 deg^2. For guidance in the following
discussion, Chunks 1 through 9, inclusive, constitute Year One, and Chunks 10 through 18,
Year Two. Stripe 82 was targeted twice with different targeting algorithms: once in Year One
(Chunk 1) and once in Year Two (Chunk 11).

If an object satisfies the selection criteria of one or more of our methods outlined below,
bits in the BOSS_TARGET1 target flag are set. Table 2 gives the flag name, the bit value and
the short description of the different target selection flags.
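
As an illustration of how such a bitwise target flag can be built and read back, the sketch below uses a subset of the bit values listed in Table 2 and the P(XDQSO) > 0.424 CORE threshold quoted in the caption of Fig. 2. The helper functions are hypothetical and are not taken from the BOSS targeting code.

```python
# Subset of the BOSS_TARGET1 bit positions listed in Table 2.
QSO_BITS = {
    "QSO_KNOWN_MIDZ": 12,
    "QSO_NN": 14,
    "QSO_FIRST_BOSS": 18,
    "QSO_CORE_MAIN": 40,
    "QSO_BONUS_MAIN": 41,
    "QSO_CORE_ED": 42,
}

XDQSO_CORE_THRESHOLD = 0.424  # CORE cut quoted in the caption of Fig. 2


def set_flags(names):
    """Combine the named target bits into a single integer flag value."""
    value = 0
    for name in names:
        value |= 1 << QSO_BITS[name]
    return value


def decode_flags(value):
    """Return the names of all known bits set in `value`, ordered by bit."""
    ordered = sorted(QSO_BITS.items(), key=lambda item: item[1])
    return [name for name, bit in ordered if value & (1 << bit)]


def core_flag(p_xdqso):
    """Hypothetical helper: set QSO_CORE_MAIN if the XDQSO cut is passed."""
    if p_xdqso > XDQSO_CORE_THRESHOLD:
        return set_flags(["QSO_CORE_MAIN"])
    return 0


# An object passing the CORE cut that also matches a FIRST radio source.
target = core_flag(0.61) | set_flags(["QSO_FIRST_BOSS"])
print(decode_flags(target))
```

Because bits 40 and above are used, the flag needs to be stored as a 64-bit integer in fixed-width formats; Python integers handle this transparently.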

As discussed in the introduction, we wish to define a 
CORE sample that is uniformly selected over the BOSS 
footprint, for statistical studies of quasars, such as mea- 
surements of the luminosity function and the clustering 
of quasars. While these goals do not drive our techni- 
cal requirements, the survey we have designed to mea- 
sure the BAO signal will also provide an unprecedented 
spectroscopic dataset for studies of quasars themselves. 
Thus, design choices that are roughly neutral with regard 
to cost and impact on the cosmology goals are guided by 
these additional science considerations. 

This is the motivation for dividing our quasar targets into two broad classes. Since the one
(imaging) dataset that we have over the entire BOSS footprint is the SDSS single-epoch
photometry (including the new coverage in the SGC; Aihara et al. 2011), we define quasar
CORE targets as a sample of 20 targets deg^-2, which are selected only from this
single-epoch imaging data, using a uniform algorithm. As we shall see, the efficiency of the
CORE sample is near our goal of 50% (i.e. ~10 out of 20 CORE targets deg^-2 are z > 2.2
quasars). The CORE sample is designed to have a well understood, uniform, and reproducible
selection function.

In contrast, the "BONUS" sample is selected using as many methods and additional data as
deemed necessary to achieve our desired quasar density. The BONUS sample has a target
density of 20 deg^-2. The number of BONUS targets added in each region of sky is adjusted to
assure that the total density of targets, CORE + BONUS, is uniform across the sky, as we
will show in § 4.7 below. However, as we detail below, the number of BONUS targets was
extended up to 60 targets deg^-2 (and then 40 targets deg^-2), during the BOSS Commissioning
and early science phases, for a total (CORE+BONUS) of 80 (and then 60) targets deg^-2. The
efficiency of BONUS selection is generally lower than that of CORE, despite the use of
multiple algorithms and auxiliary data, simply because the relatively "easy" targets have
already been picked by CORE and are therefore not included in BONUS.
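
The target budget described above is simple arithmetic, sketched here for concreteness. The densities are those quoted in the text; the function name and the example of a region with a lower local CORE density are hypothetical.

```python
CORE_TARGET_DENSITY = 20.0   # CORE targets deg^-2
CORE_EFFICIENCY = 0.5        # approximate CORE efficiency goal
TOTAL_TARGET_DENSITY = 40.0  # CORE + BONUS targets deg^-2 in the main survey


def bonus_density(core_density, total_density=TOTAL_TARGET_DENSITY):
    """BONUS targets deg^-2 needed to bring a region up to the total density."""
    return max(total_density - core_density, 0.0)


# Expected z > 2.2 quasars deg^-2 from CORE alone.
expected_core_quasars = CORE_TARGET_DENSITY * CORE_EFFICIENCY

# BONUS allocation for the total densities quoted in the text (80, 60, then 40).
bonus_by_total = {
    total: bonus_density(CORE_TARGET_DENSITY, total) for total in (80.0, 60.0, 40.0)
}

# Hypothetical region where CORE yields only 17 targets deg^-2.
bonus_sparse_region = bonus_density(17.0)

print(expected_core_quasars, bonus_by_total, bonus_sparse_region)
```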

Prior to BOSS, there was no extant survey that suc- 
cessfully targeted z > 2.2 quasars to the depth and sur- 
face density and with the efficiency we needed. The first 
year of BOSS spectroscopy was therefore largely a com- 
missioning year for quasar target selection, during which 
we gathered the quasar sample needed to test our various 
algorithms. In particular, it was only at the end of the 
year that we settled on the final CORE and BONUS algo- 
rithms. Thus, the nominal CORE-selected objects from 
the first year are not a uniformly selected sample. Sec. 6
describes the completeness of the final CORE sample in 
Year One spectroscopy. 

Through this first year, we worked on and refined a 
variety of algorithms for BOSS target selection, as it was 
not clear from the outset that any single method could 
meet our scientific goals. These methods include: 

• The Non-parametric Bayesian Classification and Kernel Density Estimator (KDE; Richards et
  al. 2004, 2009a), which measures the densities of quasars and stars in color-color space
  from training sets. Richards et al. (2009a) showed that this was able to identify quasars
  at 2.2 < z < 3.5 from SDSS photometry with an efficiency of 46.4±5.8%, down to a magnitude
  limit of i = 21.3, approximately 0.5 magnitudes brighter than the BOSS limit.

• A likelihood approach (Kirkpatrick et al. 2011), which determines the likelihood that each
  object is a quasar, given its photometry and models for the stellar and quasar loci.

• A Neural Network (NN) approach from Yèche et al. (2010), which takes as input the SDSS
  photometry and errors.

• A variant of the likelihood approach, which accounts for the observational errors more
  properly
TABLE 2
The flag name, bit value and the short description of the different target selection flags.

BOSS_TARGET1 flag       Bit   Description                                  Used in Year Two?
QSO_CORE (a)            10    Restrictive quasar selection                 No
QSO_BONUS (a)           11    Permissive quasar selection                  No
QSO_KNOWN_MIDZ          12    Known quasar with z > 2.15                   Yes
QSO_KNOWN_LOHIZ (b)     13    Known quasar with z < 2.15                   Yes
QSO_NN (c)              14    Neural Net                                   Yes
QSO_UKIDSS (d)          15    K-excess targets                             No
QSO_KDE_COADD           16    KDE targets from the Stripe 82 coadd         No
QSO_LIKE                17    Likelihood method                            Yes
QSO_FIRST_BOSS          18    FIRST radio match                            Yes
QSO_KDE                 19    Selected by KDE+χ^2                          Yes
QSO_CORE_MAIN (e)       40    Main survey CORE sample                      Yes
QSO_BONUS_MAIN (e,f)    41    Main survey BONUS sample                     Yes
QSO_CORE_ED             42    Extreme Deconvolution in CORE                Yes
QSO_CORE_LIKE           43    Likelihood objects that make it into CORE    Yes
QSO_KNOWN_SUPPZ         44    Known quasars with 1.80 < z < 2.15           Yes

(a) QSO_CORE and QSO_BONUS were set only for Chunks 1 and 2, after which the definition of CORE and BONUS changed.
(b) These objects are not targeted.
(c) Set if an object is selected by the first stage neural network (§ 3.4).
(d) These objects were only targeted on Chunk 1.
(e) QSO_CORE_MAIN and QSO_BONUS_MAIN were introduced with Chunk 3, and identify the CORE and BONUS samples.
    They appear in tandem with another flag indicating the specific method that selected each object.
(f) Set if an object is selected by the NN-Combinator.



  when determining the stellar locus, called "Extreme Deconvolution" (XD; Bovy et al. 2009).
  Bovy et al. (2011) present full details on how the XD method can be used to describe a
  probabilistic quasar target selection technique, called "XDQSO", that uses density
  estimation in flux space to assign quasar probabilities to all SDSS point sources. XDQSO
  was not used in Year One target selection, but it did become the CORE method in Year Two.



Each of the methods described above has one, or more, key parameters; these are summarized
in Table 3, and Table 2 gives the associated bitwise target flags. We now describe each of
these methods in turn, leaving the details for the cited papers. We also introduce a variant
of the NN, the "Combined Neural Network" (a.k.a. the NN-Combinator), which incorporates
information from all the methods and produces the BONUS sample. We also describe several
ancillary methods of selection, including objects associated with FIRST radio sources
(§ 3.6) and repeat observations of previously known z > 2.2 quasars (§ 3.7).



3.2. Kernel Density Estimation and χ^2 cuts

Gray & Moore (2003), Gray & Riegel (2006), and Riegel et al. (2008) describe the KDE
classification scheme. Richards et al. (2004) and Richards et al. (2009a) have applied it to
the SDSS imaging data to produce photometric quasar catalogs containing of order 10^5 to
10^6 quasars. The principles of the KDE are as follows. A sample of objects of known
classification (stars and quasars) serves as a training set, from which the smoothed
distributions of quasar and star probability as a function of color are constructed. This
allows one to compute the probability that any object of interest from the test set is a
star, "KDE star density", or quasar, "KDE quasar density" (e.g. Fig. 8 in Richards et al.
2009a). Based on these probabilities, we define the "KDE probability" (see Fig. 2 and
Table 3) as:

$$ \mathrm{KDE}_{\mathrm{Prob}} = \frac{\text{KDE quasar density}}{\text{KDE quasar density} + \text{KDE star density}} , \qquad (1) $$

which can be used to decide whether a given object should be targeted as a quasar. As
described in Section 3.5 of Richards et al. (2009a), for our purposes we define the quasar
density just for those objects with 2.2 < z < 3.5; all other quasars are put into the "star"
category.
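
A minimal numerical sketch of Equation (1) is given below. It is not the implementation of Richards et al. (2009a): it assumes a fixed-bandwidth Gaussian kernel in a two-dimensional color space, equal weighting of the two training sets, and synthetic training data, and all function and variable names are hypothetical.

```python
import numpy as np


def kde_density(points, training, bandwidth):
    """Gaussian kernel density estimate at `points` from `training` samples."""
    points = np.atleast_2d(points)
    training = np.atleast_2d(training)
    dim = training.shape[1]
    diff = points[:, None, :] - training[None, :, :]
    scaled_sq_dist = np.sum(diff ** 2, axis=-1) / bandwidth ** 2
    norm = (2.0 * np.pi * bandwidth ** 2) ** (dim / 2.0)
    return np.exp(-0.5 * scaled_sq_dist).mean(axis=1) / norm


def kde_prob(points, quasar_training, star_training, bandwidth=0.1):
    """Equation (1): quasar density / (quasar density + star density)."""
    quasar = kde_density(points, quasar_training, bandwidth)
    star = kde_density(points, star_training, bandwidth)
    total = quasar + star
    # Objects far from both training sets get probability 0 instead of NaN.
    return np.divide(quasar, total, out=np.zeros_like(quasar), where=total > 0)


# Synthetic training sets in a made-up (u - g, g - r) color plane.
rng = np.random.default_rng(0)
quasar_training = rng.normal(loc=[0.3, 0.1], scale=0.1, size=(500, 2))
star_training = rng.normal(loc=[1.2, 0.5], scale=0.1, size=(500, 2))

test_colors = np.array([[0.3, 0.1], [1.2, 0.5]])
probabilities = kde_prob(test_colors, quasar_training, star_training)
print(probabilities)
```

In this sketch an object at the center of the synthetic quasar locus receives a probability close to one, and an object at the center of the stellar locus a probability close to zero; a threshold on this value then decides which objects are targeted.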



Richards et al. (2004, 2009a) actually define two KDEs, split at g = 21, with separate color
loci (different "trainings") for the bright and faint estimations. This approach crudely
accounts for the very different photometric errors of the two sets, given that the KDE
method, as implemented, does not take errors explicitly into account.



Roughly 45% of objects in the KDE catalog of Richards 
et al. (2009a) in the "mid-z" range (i.e. the redshift range 
of interest to BOSS) are not stars (Table 4, Richards 
et al. 2009a), based on an analysis of the classification 
efficiency using clustering (e.g., Myers et al. 2006). In 
the absence of significant contamination by galaxies at 
the faint end of the KDE catalog, the KDE algorithm is 
thus about 45% efficient at the Richards et al. (2009a) 
target density of 18.6 mid-z quasars deg$^{-2}$. 

We need a higher efficiency for BOSS, so we have 
applied an additional cut beyond that of the Richards et 
al. papers to improve the efficiency of the KDE method. 
This cut is based on the $\chi^2_{\rm star}$ statistic introduced by 
Hennawi et al. (2010), which quantifies how far a given 
object is from the stellar locus: 

$$ \chi^2_{\rm star} = \sum_{m=ugriz} \frac{\left[f^m_{\rm data} - A f^m_{\rm model}\right]^2}{\left[\sigma^m_{\rm data}\right]^2 + A^2 \left[\sigma^m_{\rm model}\right]^2}, \qquad (2) $$



where $f^m$ is the flux in each of the five SDSS bands 
(m = ugriz) for the data and for the model, $\sigma^m_{\rm data}$ is 
the flux error in each band, $\sigma^m_{\rm model}$ is the model 
uncertainty in each band, and $A$ is a normalization. Following 
Hennawi et al. (2010), the stellar locus is defined by 
a set of ~14,000 stars with accurate photometry from 
SDSS spectroscopic plates, on which all point sources 
were targeted above a flux limit of i < 19.1 regardless 
of color (Adelman-McCarthy et al. 2006). The minimum 
distance to the stellar locus, $\chi^2_{\rm star}$, can then be 
computed by minimizing the value $\chi^2(A, g - i)$, where $A$ is 
the normalization constant relating the data to a model, 
$f^m_{\rm data} = A f^m_{\rm model}$, and $g - i$ is the color chosen as a proxy 
for stellar temperature. The distribution of the minimum 
distance to the stellar locus, i.e. the range of $\chi^2_{\rm star}$, is shown 
in Fig. 3 of Hennawi et al. (2010). The crucial strength 
that the $\chi^2_{\rm star}$ cut adds to our KDE selection is the 
rejection of objects that have colors consistent with those of 
quasars, but have flux errors that make them consistent 
with the stellar locus as well. 

TABLE 3 
Key parameters for the various methods and the variable name in the output target files. 

Method           Key Parameter(s)        Variable name      References 
                                         in target files 
Likelihood       $\mathcal{P}$           LIKE_RATIO         Kirkpatrick et al. (2011) 
KDE              KDE$_{\rm prob}$        KDE_Prob           Richards et al. (2009a) 
                 $\chi^2_{\rm star}$     chi2_star          Hennawi et al. (2010) 
Neural Network   $y_{\rm NN}$            NN_XNN             Yèche et al. (2010) 
                 $z_p^{\rm NN}$          NN_ZNN_phot        Yèche et al. (2010) 
XDQSO            P(XDQSO)                QSOED_PROB         Bovy et al. (2011) 
Combined-NN      NN Value                NN_VALUE           this paper 
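
The minimization behind equation (2) can be sketched as follows. This is an illustrative brute-force version, assuming the stellar locus is tabulated as a short list of model fluxes and uncertainties (one entry per g - i bin) and that the normalization A is scanned on a fixed grid; the locus values, grid and function names are hypothetical, and Hennawi et al. (2010) should be consulted for the actual procedure.

```python
def chi2(norm, f_data, sig_data, f_model, sig_model):
    """Equation (2) for one locus point and one normalization A (= norm)."""
    return sum(
        (fd - norm * fm) ** 2 / (sd ** 2 + (norm * sm) ** 2)
        for fd, sd, fm, sm in zip(f_data, sig_data, f_model, sig_model)
    )

def chi2_star(f_data, sig_data, locus, norm_grid):
    """Minimum of chi2 over all locus points (g - i bins) and normalizations."""
    return min(
        chi2(norm, f_data, sig_data, f_model, sig_model)
        for f_model, sig_model in locus
        for norm in norm_grid
    )

# Hypothetical two-point stellar locus: (ugriz model fluxes, model uncertainties).
locus = [
    ([1.0, 2.0, 3.0, 4.0, 5.0], [0.05] * 5),
    ([2.0, 2.5, 3.0, 3.2, 3.3], [0.05] * 5),
]
norm_grid = [k / 100.0 for k in range(50, 351)]  # A from 0.50 to 3.50

errors = [0.1] * 5
star_like = [2.0, 4.0, 6.0, 8.0, 10.0]  # exactly twice the first locus point
odd_object = [5.0, 1.0, 5.0, 1.0, 5.0]  # cannot be matched by rescaling the locus

print(chi2_star(star_like, errors, locus, norm_grid))
print(chi2_star(odd_object, errors, locus, norm_grid))
```

An object is kept by the cut when its minimum value exceeds the chosen threshold, so in this toy case "odd_object" would pass and "star_like" would be rejected.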

The key parameters (Fig. 2) for the KDE method are 
the minimum thresholds for selection in both KDE$_{\rm prob}$ 
and $\chi^2_{\rm star}$. Early in Year One, CORE objects were 
selected solely by the KDE algorithm (Section 4); at that 
time, we applied a limit $\chi^2_{\rm star} > 7$. Later, when KDE was 
no longer the CORE algorithm, we relaxed this criterion 
to $\chi^2_{\rm star} > 3$. Objects selected by the KDE method have 
the QSO_KDE target flag set. 

3.3. Likelihood Method 

Full details of the Likelihood method, including an 
in-depth analysis of its performance, are presented in 
Kirkpatrick et al. (2011). We summarize it briefly here. 
Like KDE, the Likelihood method starts with a sample 
of known quasars, and a sample of "Everything Else" 
(EE in what follows), i.e., stars and galaxies, with ugriz 
photometry and errors. One defines likelihoods that a 
given object with fluxes $f^m$ and errors $\sigma^m$ (m = ugriz) 
is drawn from the quasar or EE catalog by summing a 
$\chi^2$-like statistic over the full training set: 

$$ \mathcal{L}_{\rm quasar} = \sum_i \prod_m \frac{1}{\sqrt{2\pi(\sigma^m)^2}} \exp\left(-\frac{\left[f^m - {\rm quasar}^m_i\right]^2}{2(\sigma^m)^2}\right), \qquad (3) $$

$$ \mathcal{L}_{\rm EE} = \sum_i \prod_m \frac{1}{\sqrt{2\pi(\sigma^m)^2}} \exp\left(-\frac{\left[f^m - {\rm EE}^m_i\right]^2}{2(\sigma^m)^2}\right). \qquad (4) $$



The sums are over all objects i in the training set. By 
restricting the sum to those training-set quasars in a spe- 
cific redshift range, one can define an equivalent like- 
lihood that the object in question is in this redshift 
range; in Year One, this was done by summing over those 



quasars with z > 2.2. Given these likelihoods, one defines 
a probability that the object is a quasar to be targeted 
(compare with equation 1): 

$$ \mathcal{P} = \frac{\mathcal{L}_{\rm quasar}(z > 2.2)/A_{\rm quasar}}{\mathcal{L}_{\rm EE}/A_{\rm EE} + \mathcal{L}_{\rm quasar}({\rm all}\ z)/A_{\rm quasar}}, \qquad (5) $$

where the $A$s normalize for the possibly different effective 
solid angles of the quasar and EE training sets. In the 
denominator, the likelihood sum is over quasars at all 
redshifts, not just those at z > 2.2. 
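
Equations (3)-(5) translate almost directly into code. The sketch below is a toy illustration only: the training fluxes, redshifts and solid-angle normalizations are invented, and the function names are hypothetical rather than those of the Kirkpatrick et al. (2011) code.

```python
import math

def likelihood(fluxes, errors, training):
    """Equations (3)/(4): sum over training objects of a product over bands."""
    total = 0.0
    for train_fluxes in training:
        term = 1.0
        for f, sigma, tf in zip(fluxes, errors, train_fluxes):
            term *= math.exp(-0.5 * (f - tf) ** 2 / sigma ** 2)
            term /= math.sqrt(2.0 * math.pi * sigma ** 2)
        total += term
    return total

def quasar_probability(fluxes, errors, qso_train, qso_z, ee_train,
                       area_qso=1.0, area_ee=1.0, z_min=2.2):
    """Equation (5): targeted-quasar likelihood over the total likelihood."""
    targeted = [t for t, z in zip(qso_train, qso_z) if z > z_min]
    l_targeted = likelihood(fluxes, errors, targeted)
    l_all_qso = likelihood(fluxes, errors, qso_train)
    l_ee = likelihood(fluxes, errors, ee_train)
    denom = l_ee / area_ee + l_all_qso / area_qso
    return (l_targeted / area_qso) / denom if denom > 0.0 else 0.0

# Toy ugriz training fluxes (arbitrary units) and redshifts.
qso_train = [(1.0, 2.0, 3.0, 2.0, 1.0), (3.0, 3.0, 3.0, 3.0, 3.0)]
qso_z = [2.5, 1.0]
ee_train = [(5.0, 5.0, 5.0, 5.0, 5.0)]
errors = [0.5] * 5

print(quasar_probability((1.0, 2.0, 3.0, 2.0, 1.0), errors, qso_train, qso_z, ee_train))
```

Because the numerator sums over a subset of the quasars in the denominator, the returned value never exceeds one.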

Like the KDE method above, this method makes use 
of the varying densities of objects in color space, and 
includes a $\chi^2$ selection. Note that it correctly utilizes the 
flux errors in determining whether a given object belongs 
to the quasar or EE class. Potential quasar targets can 
be ranked by their probability $\mathcal{P}$. We define a 
threshold ($\mathcal{P} > 0.234$); for $\mathcal{P}$ above this value, we target all 
objects as quasars. The Likelihood method was chosen 
as the CORE algorithm near the end of Year One (Section 
4.4). Objects selected by the Likelihood method 
have the QSO_LIKE target flag set. 

3.4. Artificial Neural Network 
We use an Artificial Neural Network (NN) at two 
stages of the selection process. Full details of this 
algorithm may be found in Yèche et al. (2010). As in 
the previous methods, we define training sets of known 
quasars, and objects that are not quasars. 

For the first stage, we use the NN with 10 inputs 
for each object (the SDSS g-band magnitude, the five 
SDSS magnitude errors and the four SDSS colors). The 
training set for non-quasars is a set of ~30,000 SDSS 
point sources from SDSS DR7 (Abazajian et al. 2009), 
selected over the magnitude range 18.0 < g < 22.0 and 
with Galactic latitude b ~ 45° to average the effects of 
Galactic extinction. The training set for quasars 
consisted of spectroscopically confirmed quasars from the 
2QZ (Croom et al. 2004), 2SLAQ (Croom et al. 2009), 
and the SDSS (Schneider et al. 2010) quasar catalogs. 

The NN developed for target selection has four layers 
of "neurons" (see Fig. 3 of Yèche et al. 2010). The 
fourth layer only has one neuron, providing a single 
output parameter, $y_{\rm NN}$. The quantity $y_{\rm NN}$ quantifies the 
probability that an input object is a quasar, although 
since $y_{\rm NN}$ can be greater than 1, it is not a probability 
in the formal sense. A photometric redshift estimate 
$z_p^{\rm NN}$ is also generated (see Section 5 of Yèche et al. 
2010), with a cut placed on this photometric redshift 
estimate, $z_p^{\rm NN} > 2.1$. Objects selected by the NN method 
have the QSO_NN target flag set. 

3.5. Extreme Deconvolution 

Extreme deconvolution (XD; Bovy et al. 2009) is a 
method to describe the underlying distribution function 
of a series of points in parameter space (e.g., quasars 
in color space), by modeling that distribution as a sum 
of Gaussians convolved with measurement errors. Bovy 
et al. (2011) apply XD to the problem of quasar target 
selection, using flux data from the SDSS DR8. The 
so-called "XDQSO" method is conceptually similar to 
the Likelihood method, but explicitly models the non- 
uniform errors of the training set from which the quasars 
and stellar/EE loci are derived. Indeed, the Likelihood 
method effectively double-counts the errors of the train- 
ing set, since the observed distribution of fluxes from 
which the Likelihood training set is built is the true un- 
derlying distribution convolved with the uncertainty dis- 
tribution. XD avoids this double-counting by deconvolv- 
ing the underlying distribution of the training set. 

XDQSO constructs a model of the distribution of the 
fluxes of stars and quasars in different redshift ranges 
based on training samples of known stars and quasars. 
XDQSO then builds a model of the relative-flux 
distribution as a mixture of 20 Gaussian components and fits 
this model to the training data, taking the heteroscedastic 
nature of the SDSS flux uncertainties fully into 
account. The XD model for the relative-flux distribution 
is fit in narrow bins in i-band magnitude and combined 
with an apparent-magnitude dependent prior based on 
star counts in Stripe 82 and the Hopkins et al. (2007) 
quasar luminosity function. The probability for an 
object to be a mid-redshift quasar (2.2 < z < 3.5) is given 
by the ratio between the number density of mid-redshift 
quasars and that of stars plus all quasars at the object's 
fluxes (in the spirit of equation 5). The probability that 
a given object is a mid-z quasar is then: 



$$ P({\rm QSO}_{\rm midz} \,|\, \{f^m\}) \propto P(\{f^m/f^i\} \,|\, {\rm QSO}_{\rm midz})\, P(f^i \,|\, {\rm QSO}_{\rm midz})\, P({\rm QSO}_{\rm midz}), \qquad (6) $$

where $m$ indexes the fluxes and $f^i$ is the SDSS i-band 
flux. The first factor on the right is given by the XD 
model for the relative-flux (i.e., color) distribution of 
quasars, while the second and third factors are obtained 
from the quasar luminosity function. The underlying 
relative-flux distribution is convolved with the object's 
flux uncertainties before evaluation. The expressions for 
stars and high/low redshift quasars are similar. 
Probabilities are normalized assuming that these classes exhaust 
the possibilities ($P({\rm QSO}_{\rm midz}) + P({\rm QSO}_{\rm hi/lo}) + P({\rm star}) = 1$). 
Objects are ranked on their mid-redshift quasar 
probability for targeting. 
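
The structure of equation (6) and of the normalization over classes can be illustrated with a deliberately simplified sketch. It assumes diagonal covariances, so that convolving a Gaussian component with the object's uncertainties amounts to adding variances per dimension, and it collapses the magnitude-dependent second and third factors into a single prior weight per class. The mixture parameters, priors and function names are hypothetical and are not those of the Bovy et al. (2011) model, which uses 20 components with full covariances.

```python
import math

def convolved_mixture_density(x, x_var, components):
    """Gaussian mixture evaluated after convolution with the object's errors.

    components: list of (weight, means, variances); diagonal covariances only.
    """
    total = 0.0
    for weight, means, variances in components:
        log_p = 0.0
        for xi, si2, mi, vi in zip(x, x_var, means, variances):
            tot = vi + si2  # convolution of two Gaussians adds their variances
            log_p += -0.5 * math.log(2.0 * math.pi * tot) - 0.5 * (xi - mi) ** 2 / tot
        total += weight * math.exp(log_p)
    return total

def class_posteriors(x, x_var, models, priors):
    """Normalize prior times density so the class probabilities sum to one."""
    unnorm = {name: priors[name] * convolved_mixture_density(x, x_var, comps)
              for name, comps in models.items()}
    norm = sum(unnorm.values())
    return {name: value / norm for name, value in unnorm.items()}

# Toy two-colour models (illustrative numbers only).
models = {
    "qso_midz": [(1.0, [0.2, 0.1], [0.01, 0.01])],
    "qso_hilo": [(1.0, [0.6, 0.3], [0.01, 0.01])],
    "star": [(0.5, [1.0, 0.5], [0.02, 0.02]), (0.5, [1.4, 0.8], [0.02, 0.02])],
}
priors = {"qso_midz": 0.05, "qso_hilo": 0.15, "star": 0.80}

sharp = class_posteriors([0.2, 0.1], [0.001, 0.001], models, priors)
noisy = class_posteriors([0.2, 0.1], [1.0, 1.0], models, priors)
print(sharp["qso_midz"], noisy["qso_midz"])
```

With large uncertainties the convolved densities flatten and the posterior falls back toward the priors, which is why even low signal-to-noise measurements can be used without special treatment.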

Since XDQSO target selection properly takes the flux 
uncertainties into account both in the training and the 
evaluation stage, it can be trained and evaluated on data 
of low signal-to-noise ratio. It can also incorporate data 
from surveys other than SDSS in a straightforward way, 
as we describe for near-infrared and ultraviolet surveys 
below. The performance of XDQSO, using Stripe 82 data, 
is given in Bovy et al. (2011) and its performance in Year 
Two will be described in a future paper. The catalog of 
SDSS objects selected by XDQSO is available through 
the SDSS-III DR8 Science Archive Server$^{31}$. 

The XDQSO method was not used during Year One, 
but we then set, and fixed, XDQSO as CORE for Year 
Two and the remainder of the BOSS. In Section 6 we 
detail how to replicate the CORE selection using XDQSO 
for the BOSS quasars. Objects selected by the XDQSO 
method have the QSO_CORE_MAIN, and sometimes the 
QSO_CORE_ED, target flag set (see Section 6). 

3.5.1. The UKIRT Infrared Deep Sky Survey 



Lawrence et al. (2007) present an overview of the 
United Kingdom Infrared Telescope (UKIRT) Infrared 
Deep Sky Survey (UKIDSS). The UKIDSS is a 
collection of five surveys of different coverage and depth 
using the Wide-Field Camera (WFCAM, Casali et al. 
2007) on UKIRT. WFCAM has an instantaneous field 
of view of 0.21 deg$^2$, and the various surveys employ 
up to five filters, ZYJHK, covering the wavelength range 
0.83-2.37 μm. The photometric system and calibration 
are described in Hewett et al. (2006) and Hodgkin et al. 
(2009), respectively. The pipeline processing is described 
in Irwin et al. (2011, in prep.) and the WFCAM Science 
Archive (WSA) by Hambly et al. (2008). The astrometry 
is accurate to 0.1". 

The UKIDSS Large Area Survey (ULAS) aims to map 
~4,000 deg$^2$ of the Northern Sky, which, when 
combined with the SDSS, produces an atlas covering almost 
an octave in wavelength. The target point-source depths 
of the survey are Y = 20.3, J = 19.5, H = 18.6, K = 18.2 
(Vega); the ULAS does not image in the WFCAM 
Z-band. Unlike the SDSS, the ULAS multiband 
photometry is not taken simultaneously (e.g. Sec. 5.2 of Dye 
et al. 2006; Lawrence et al. 2007, Sec. 4.2), so the four 
bands have different coverage maps, with the H and K 
bands obtained together, and Y and J obtained 
separately. For example, the ULAS "DR8Plus"$^{32}$ coverage is 
2,670 deg$^2$, 2,685 deg$^2$, 2,795 deg$^2$ and 2,810 deg$^2$, in Y, 
J, H and K respectively. 

We use the UKIDSS NIR photometry to improve 
target selection in two complementary techniques. The first 
is to classify quasars by their "K-excess" ("KX"; e.g. 
Warren et al. 2000; Croom et al. 2001; Sharp et al. 
2002; Chiu et al. 2007; Maddox et al. 2008; Smail et al. 
2008; Wu & Jia 2010; Peth et al. 2011). The power-law 
quasar SED has an excess in the K-band over a blackbody 
stellar SED, allowing quasars to be identified (and stars 
rejected) that would be normally excluded from an optical 
color-only quasar selection algorithm, even for dust 
reddened quasars. Peth et al. (2011) investigated the KX 
method and provided an SDSS-UKIDSS matched quasar 
catalog. For BOSS, KX-selected objects were selected 
early in commissioning and had the QSO_UKIDSS target 
flag set. However, the very low yield (from admittedly a 
small target sample) caused us to drop this method. 

The second method of inclusion of NIR photometry is 
to improve quasar classification, and of particular impor- 
tance for BOSS, photometric redshift estimation, in the 
XDQSO method. Including the NIR flux information 
removes many of the optically-based redshift degenera- 
cies known for quasars (see Bovy et al. 2011b, in prep.). 
Models were trained for SDSS-only fluxes and various 
combinations of SDSS+UKIDSS data. z ~ 2.5 quasars 
have (i - K) ~ 2.1 (e.g., Peth et al. 2011); thus given 
the BOSS quasar survey magnitude limit of i ~ 21.8, the 
ULAS catalog is too shallow to guarantee 5σ detections 
of all sources. We therefore measure aperture 
magnitudes in the UKIDSS images at the positions of SDSS 
object counterparts; even low-significance detections can 
be used by XDQSO. Bovy et al. (2011b, in prep) give 
technical details. The SDSS (optical) only model is used 
by XDQSO to generate targets for CORE, where the 
upper limit of the mid-z bin is z = 3.5. For BONUS, the 
SDSS+UKIDSS model is used to generate targets as an 
input to the NN-Combinator with an upper limit of the 
mid-z bin extended to z = 4.0. This was implemented 
in BONUS from the middle of Year Two (Chunk 16) 
onwards, with significant gains in the yield of z > 2.2 
quasars. 

31 http://data.sdss3.org/sas/dr8/groups/boss/photoObj/xdqso/xdcore/ 
32 http://surveys.roe.ac.uk/wsa/dr8_las.html 

Fig. 3. Redshift versus (u - g) color for BOSS FIRST quasar 
targets. Objects from the BOSS commissioning were either 
targeted by FIRST and also by an optical selection (black crosses), or 
they were targeted only as FIRST sources (red squares). These 
early findings inspired our (u - g) > 0.4 cut to minimize 
contamination from z < 2.2 quasars. 

3.5.2. GALEX: The Far and Near UV 

The Far (1350-1750 Å) and Near (1750-2750 Å) 
ultraviolet (FUV and NUV respectively) photometry from the 
GALEX Small Explorer mission (Martin et al. 2005) also 
provide information that could help discriminate between 
hot stars and z ~ 0.8 quasars, both of which should have 
considerably more flux in the UV than a z > 2 quasar 
because of Lyα absorption along the line of sight in the 
latter. 

We have trained the XDQSO technique on SDSS, 
UKIDSS and GALEX input data. Thus we can now 
perform 11-dimensional quasar target selection using the 
FUV/NUV/ugriz/YJHK bands. The relevant GALEX 
surveys are relatively shallow, e.g. $m_{\rm UV}$ ~ 20.5 AB, so 
most potential BOSS quasar targets are not detected at 
high significance. Despite this, our tests (detailed in 
Section 5) confirmed that GALEX measurements, even at 
low significance, do help with target selection. 

We had access to medium-deep GALEX data on Stripe 
82 at the start of Year Two, when we targeted the Stripe 
for the second time (Chunk 11; § 4.4). We therefore 
incorporated the GALEX FUV and NUV fluxes in the 
XDQSO probabilities. 



3.6. Radio Selection 

As in the SDSS-I/II quasar survey, objects that are 
detected in the FIRST radio survey (Becker et al. 1995) 
are also incorporated in target selection. Radio stars are 
rare, thus most radio sources with faint, unresolved 
optical counterparts are quasars. Optical stellar objects with 
g < 22.00 or r < 21.85 which have FIRST counterparts 
within 1" are considered as potential quasar targets, 
irrespective of the radio morphology. 

In the early BOSS commissioning data (§ 4), we simply 
selected all such radio matches. This approach targeted a 
substantial number of quasars with z < 2.2, and thus we 
placed an additional color cut, (u - g) > 0.4, to exclude 
UV-excess sources at lower redshift (Fig. 3). Thus the 
QSO_FIRST flag designates objects with (u - g) > 0.4 
that matched a FIRST source. Bluer FIRST sources are 
not rejected outright, but are required to pass one of the 
regular optical color selections to be selected. Section 4 
describes when in Year One this (u - g) > 0.4 cut was 
implemented. 
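
The radio-selection logic described above reduces to a few boolean conditions. The sketch below is a simplified illustration for a single optical point source: the function name and return convention are hypothetical, and it ignores fiber budgets and priorities entirely.

```python
def first_radio_selection(sep_arcsec, g_mag, r_mag, u_minus_g,
                          passes_optical_selection):
    """Return (qso_first_flag, targeted) for one optical point source.

    sep_arcsec is the separation to the nearest FIRST source.
    """
    within_limits = g_mag < 22.00 or r_mag < 21.85
    has_first_match = within_limits and sep_arcsec <= 1.0
    # Red FIRST matches get the QSO_FIRST flag; bluer ones must be selected
    # by one of the regular optical colour selections instead.
    qso_first = has_first_match and u_minus_g > 0.4
    targeted = qso_first or passes_optical_selection
    return qso_first, targeted

print(first_radio_selection(0.4, 21.0, 20.8, 0.9, False))  # red FIRST match
print(first_radio_selection(0.4, 21.0, 20.8, 0.1, False))  # blue FIRST match
```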



3.7. Previously Known Objects 

The density of z > 2.2 quasars known before BOSS 
started was ~ 2 objects deg -2 . Given the superior 
throughput of the BOSS spectrographs over those of 
SDSS-I/II, we decided to re-observe these objects for 
improved Lya forest clustering signal. Moreover, this 
allows vital checks of survey quality and uniformity, and 
the data can be used to study the spectroscopic variabil- 
ity of quasars. We thus target previously known spectro- 
scopically confirmed z > 2.15 quasars from the literature. 
We include such objects as targets if they match a point 
source in the target imaging to within 1.5", or if they 
match a point source in the target imaging to within 2" 
and match the magnitude of that object to within 0.5. 
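
The two-tier matching rule for previously known quasars can be written compactly. This is a minimal sketch: the haversine separation is a standard formula, while the function names and the example coordinates and magnitudes are hypothetical.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0

def matches_known_quasar(sep_arcsec, mag_known, mag_imaging):
    """Within 1.5 arcsec, or within 2 arcsec with magnitudes agreeing to 0.5."""
    if sep_arcsec <= 1.5:
        return True
    return sep_arcsec <= 2.0 and abs(mag_known - mag_imaging) <= 0.5

# A catalog position offset by 1.8 arcsec in declination from the imaging source.
sep = angular_separation_arcsec(150.0, 2.0, 150.0, 2.0 + 1.8 / 3600.0)
print(sep, matches_known_quasar(sep, 20.1, 20.4))
```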

The catalogs of previously known quasars we use 
include the SDSS DR7 quasar catalog (Schneider et al. 
2010), the 2SLAQ quasar catalog (Croom et al. 2009), 
the 2QZ survey (Croom et al. 2004), the AAT-UKIDSS-SDSS 
(AUS) survey (Croom et al., in prep), and the 
MMT-BOSS pilot survey (see the Appendix). 

To compare and check our moderate resolution spec- 
tra of generally fainter quasars to those taken by 10m 
class telescopes using high-resolution spectrographs (e.g. 
KECK-HIRES and VLT-UVES), we also mined the data 
archives (the NED 33 , the Keck Observatory Archive 34 
and the ESO Science Archive Facility 35 ) and added those 
quasars with z > 2.15 that were not included from the 
above catalogs. 

The full sample of known quasars contains ~18,000 
z > 2.15 objects. We assign those objects in the BOSS 
footprint the QSO_KNOWN_MIDZ flag and give them 
highest targeting priority in tiling (Blanton et al. 2003). 

We also veto previously known low (z < 2.15) 
redshift quasars identified from the SDSS-I/II, 2QZ, 
2SLAQ and MMT surveys, labeling them with the 
QSO_KNOWN_LOHIZ target flag and never assigning 
them spectroscopic fibers$^{36}$. We are confident that we 
are not inadvertently rejecting any real z > 2.2 quasars, 
since the vast majority of these objects were visually 
inspected and identified in the SDSS, 2QZ and MMT 
surveys (Schneider et al. 2010; Croom et al. 2005). A veto 
of objects with known stellar spectra, again from the 
SDSS-I/II, 2QZ, 2SLAQ and MMT surveys, was not 
implemented until Chunk 5, because we were not initially 
confident that shallower surveys, at their faint end, would 
have sufficient S/N to correctly identify stars, and that 
our initial matching procedures were not discarding some 
quasars of utility to BOSS. 

33 http://nedwww.ipac.caltech.edu/ 
34 http://www2.keck.hawaii.edu/koa/public/koa.php 
35 http://archive.eso.org/ 
36 The name for this flag, QSO_KNOWN_LOHIZ, is misleading, 
in that it does not explicitly flag high-z quasars. 

3.8. Combinations of Methods 

Combining results from several of the methods de- 
scribed above in target selection requires a method to 
merge the (overlapping) ranked lists from these methods 
into a single ranked catalog. The challenge is shown in 
Fig. 4, which shows the surface density of the union of 
those objects selected by the KDE, Likelihood, and NN 
methods with no further refinement, to yield an average 
target density of ~60 targets deg$^{-2}$. The tidal stream 
of the Sagittarius dwarf spheroidal galaxy (Ibata et al. 
1995; Belokurov et al. 2006) is quite striking in this 
figure, spanning 180° < α < 240° and 0° < δ < +15°. The 
target density in Figure 4 varies from 35 to 70 deg$^{-2}$. 

3.8.1. Tuning and Ranking 

In the early stages of commissioning, the target density 
was tuned to 80 deg$^{-2}$ using the KDE method (and its 
$\chi^2_{\rm star}$ parameter). The three main Year One algorithms 
(KDE, Likelihood and NN) were then trained on regions 
where very early BOSS spectroscopy was obtained. This 
was mainly in Stripe 82 (observed in Chunk 1; see § 4.1), 
but also some of Chunk 2, yielding ~650 z > 2.2 quasars 
from ~2000 targets. For these initial tests, the limiting 
parameters of the KDE, Likelihood and NN methods 
were chosen to give target densities of 80 deg$^{-2}$ each, 
and each produced a ranked list (based on the value of 
their respective output probability parameter) of targets. 
These three lists were then combined to generate the list 
of the 60 targets deg -2 most likely to be high-z quasars, 
finding the interleaving (without repeating objects se- 
lected by more than one algorithm) of the combined list 
of objects that led to the highest yield of z > 2.2 quasars. 
That is, we first took the first-ranked object from each of 
the three methods, then the second-ranked object, and 
so on, of course not double-counting objects which were 
selected by more than one method. Each of these objects 
is associated with a ranking parameter (as listed in 
Table 3), giving us a relative ranking of the three methods 
which we can use for combining other data in which one 
did not know a priori which objects were actually z > 2.2 
quasars. This technique was tested by splitting the ini- 
tial data in half and running the ranking algorithm to 
find the thresholds required for each of the three meth- 
ods. Observed targets from the second half of the data 
were also chosen using these calculated thresholds, and 
the yield of z > 2.2 quasars was consistent. The result 
of the combined rankings was to allocate targets to the 
three methods in approximately equal quantity and pri- 
ority. 
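
The round-robin step described above (first-ranked object from each method, then second-ranked, and so on, without double-counting) can be sketched as follows. The function name and the toy object identifiers are hypothetical, and the sketch covers only the interleaving itself, not the tuning of per-method thresholds against the spectroscopic yield.

```python
def interleave_ranked_lists(ranked_lists, n_targets):
    """Merge ranked lists rank by rank, skipping objects already selected."""
    if n_targets <= 0:
        return []
    merged, seen = [], set()
    depth = max((len(lst) for lst in ranked_lists), default=0)
    for rank in range(depth):
        for lst in ranked_lists:
            if rank < len(lst) and lst[rank] not in seen:
                seen.add(lst[rank])
                merged.append(lst[rank])
                if len(merged) == n_targets:
                    return merged
    return merged

# Toy ranked target lists (best candidate first) from the three methods.
kde = ["a", "b", "c"]
likelihood = ["b", "d", "e"]
nn = ["a", "f", "g"]
print(interleave_ranked_lists([kde, likelihood, nn], 5))
```

The position in each input list at which the merged list reaches the desired length defines the effective threshold on that method's ranking parameter.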

3.8.2. NN-Combinator 



We found that the outputs of the three methods could 
be used as inputs into a neural net to improve the yield 
of z > 2.2 quasars. We refer to this approach in what 
follows as the NN-Combinator. This approach can eas- 
ily be expanded to allow input from additional selection 
techniques. 

The key output parameter of the NN-Combinator is 
designated as the "NN value", which is, by design, 
allowed to change from chunk to chunk. The 
NN-Combinator used the data from Stripe 82 obtained by 
BOSS (Chunk 1, see Section 4.1 below) as an input 
training set. The NN-Combinator was the selection method 
for BONUS from Chunk 7 onwards in the survey, drawing 
on the inputs of KDE, Likelihood, and NN. This replaced 
the interleaving method described in § 3.8.1. 



In Year Two, with the advent of the XDQSO method, 
we added the results of this method to the 
NN-Combinator. In particular, near the end of Year Two, we used 
a version of XDQSO that included data from UKIDSS 
(§ 3.5.1) which selected targets to z = 4; the version of 
XDQSO used for CORE used SDSS single-epoch 
photometry only and did not incorporate UKIDSS data. 

3.9. Rationale and Summary 

As the above makes clear, BOSS quasar selection has 
been through a complex series of changes during its first 
two years. Here we recall the reasons for this complexity 
and summarize the main points of this history. 

BOSS quasar target selection is complex because 

• for the survey's defining science goal, measurement 
of BAO in the Lya forest, the primary requirement 
is a high surface density of quasars in the relevant 
redshift range, not simplicity or homogeneity of se- 
lection, 

• selection of quasars in the desired redshift range 
from single-epoch SDSS imaging is difficult because 
of proximity to the stellar locus and substantial 
photometric errors near the magnitude limit for 
BOSS selection, 

• pre-BOSS quasar samples provided inadequate 
training sets in our desired magnitude and redshift 
range, so the quasars we discovered in this first 
year allowed us to refine our algorithms as the year 
proceeded. 

Roughly speaking, the effective survey volume for mea- 
surement of Lya forest clustering is quadratic in the num- 
ber of quasars, so even modest gains in efficiency have a 
significant science impact. 
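
As a rough worked example of this scaling, the snippet below compares the survey requirement of 15 quasars deg^-2 with the goal of 20 quasars deg^-2. It assumes the purely quadratic dependence quoted above; the function name is hypothetical.

```python
def relative_survey_volume(density, reference_density):
    """Effective-volume ratio under the rough quadratic scaling quoted above."""
    return (density / reference_density) ** 2

# Goal (20 deg^-2) relative to requirement (15 deg^-2): about 1.78.
print(round(relative_survey_volume(20.0, 15.0), 2))
```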

As discussed in § 3.1, the goal of CORE selection is 
to provide a homogeneously selected sample suitable for 
quasar science. Ideally, we would have frozen the CORE 
algorithm at the very beginning of BOSS, but the higher 
imperative of maximizing efficiency has led us to alter 
CORE as our algorithms improved. We started by 
using KDE+$\chi^2$ as the CORE algorithm but switched to 
Likelihood based on its greater flexibility and simplicity. 
Finally, we switched from Likelihood to XDQSO based 
on its better performance (at a level of ~one additional 
high-z quasar deg$^{-2}$). The chunk-by-chunk history of 
these changes is given in § 4 below. We intend to 
maintain a fixed CORE algorithm for Years 2-5 of the survey, 
and for many purposes we anticipate that completeness 
corrections will allow use of Year One data in statistical 
studies of the quasar population (see § 5). 

Fig. 4. The BOSS quasar target surface density in Equatorial coordinates in the NGC, from a run of the BOSS QTS with a selection 
made by combining the three Year One methods, KDE, Likelihood and NN, in such a way that the average target density over the full 
given NGC area was ~60 quasar targets deg$^{-2}$. The color indicates the local number density of targets per square degree. The tidal 
stream of the Sagittarius dwarf spheroidal galaxy is prominent in the region 180° < α < 240° and 0° < δ < +15°. The white lines show 
the "Blind Test Area", described in § 5.5. 

Beyond CORE, we use whatever combinations of data 
and methods can maximize our targeting efficiency, 
including known quasars, FIRST candidates, and the 
BONUS sample. Because the methods described in 
§§ 3.2-3.5 have complementary strengths, we draw on all 
of them in creating the BONUS sample. We have tried 
different methods of forming a combined BONUS list 
during the first year, and we have now settled on the 
NN-Combinator (§ 3.8.2) as our primary tool for doing so. 
The individual methods feeding into the NN-Combinator 
use co-added SDSS photometry where it is available in 
overlap regions, in contrast to CORE, which relies on 
single-epoch photometry to ensure uniformity. Auxiliary 
data such as UKIDSS and GALEX photometry are fed 
into the XDQSO selection, which in turn is fed into the 
NN-Combinator. 

4. BOSS QUASAR TARGET SELECTION FOR 
YEARS ONE AND TWO, CHUNK BY CHUNK 

BOSS is a five year project running from 2009 Au- 
gust to the end of June 2014. Starting in 2009 Septem- 
ber, target selection commissioning (both for the galax- 
ies and quasars) ran alongside commissioning of the new 
hardware and reduction software. The hardware commis- 
sioning was essentially complete by 2009 December (data 
taken earlier were therefore not of survey quality), but 
QTS commissioning continued through 2010 April; dur- 
ing this period the quasar target density was set appre- 
ciably higher (60 or 80 deg -2 ), than for the nominal sur- 
vey (40 deg -2 ). The bulk of the Year One observations 
from MJD=55176 (2009 December 11) to MJD=55383 
(2010 July 6) were thus QTS commissioning data. 
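
The Modified Julian Dates quoted above can be converted to calendar dates with a few lines of code, using the standard definition that MJD 0 corresponds to 1858 November 17. The function name below is illustrative.

```python
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)  # calendar date of MJD 0

def mjd_to_date(mjd):
    """Calendar date (UT) on which the given integer MJD begins."""
    return MJD_EPOCH + timedelta(days=int(mjd))

print(mjd_to_date(55176))  # start of the Year One survey-quality data
print(mjd_to_date(55383))  # end of the Year One observations
```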

The targeting chunks into which the Year One and 
Year Two data were divided are detailed in Table 4 and 
Figure 5. By the end of Year Two, we had run target 
selection over the whole 10,000 deg$^2$ imaging footprint, 
resulting in ~430,000 tiled targets. This target list is not 
necessarily final - if we obtain data that could improve 
our target selection efficiency in later years of BOSS, we 
will rerun target selection for areas that have not yet been 
observed. Spectra collected during Years One and Two 
will constitute the DR9, and will include 150,000 quasar 
targets, a third to half of which will be z > 2.2 quasars. 
By the end of Year Two, we will have observed all of 
the Year One chunks. The BOSS quasar target selec- 
tion changed from chunk to chunk during the first year, 
as we gathered data and refined our algorithms. These 
changes in the algorithms are detailed in the following 
subsections. 

4.1. Chunk 1 

The first area that we targeted and observed for 
BOSS was SDSS Stripe 82, along the celestial equator 
in the Southern Galactic Cap. The target field covered 
317.0° < $\alpha_{\rm J2000}$ < 45.0°, -1.25° < $\delta_{\rm J2000}$ < 1.25°, for 
a total area of 220 deg$^2$ (smaller than the ~300 deg$^2$ 
imaging coverage on the Stripe). 

The KDE method, based on single-run data and with a 
cut at $\chi^2_{\rm star} > 7.0$, was used as the CORE (QSO_CORE) 
selection for Chunk 1. The KDE method was one 
of the techniques used for BONUS, (QSO_BONUS) 
with targets chosen using the coadded data described 
by Section g) and given the flag QSO_KDE_COADD. 
Coadded data were not used in later chunks, thus the 
QSO_KDE_COADD flag was used only for Chunk 1. In 
Chunk 1, with the benefit of coadded data, the quasar 
and stellar loci were better defined than in the standard 
one-epoch SDSS data. Hence there was far more overlap 
between the samples of sources targeted by all of the 
methods, freeing fibers to be placed on lower-priority 
KDE targets. As most of these lower-priority targets 
proved to be stars, the overall efficiency of selection of 
the KDE method is thus quite low in Chunk 1. 

Fig. 5. The targeting footprint for the SDSS-III: BOSS Lyα forest/Quasar Survey. The various chunks are indicated by different colors. 
Chunks 16, 17 and 18 lie within the footprint of Chunk 15. The full targeting footprint is 10,200 deg$^2$, with a total of ~430,000 tiled 
targets. Roughly 150,000 of these targets will have spectra by the end of Year Two observations. The global Year One quasar target 
density is 60.4 targets deg$^{-2}$, and the mean target density over all chunks shown is 47.9 targets deg$^{-2}$. The dashed line is at Galactic 
latitude b = 25°. 

TABLE 4 
Details of the 18 chunks targeted for the first two years of BOSS observations. Spectra from each of the 18 chunks will 
be taken during the first two years, but only an area of ~3000 deg$^2$ will be covered for spectroscopy. However, we plan 
to observe all of the Year One chunks by the end of the Year Two observations. 

Area       RA (2000)         Dec (2000)       Area        Total # targets         Galactic        Quasar target density   Method for 
Name       Range             Range            (deg^2)     (tiled)                 latitude cut?   deg^-2 (tiled)          CORE 

Chunk 1    317.0 - 45.0^a    -1.25 - +1.25    219.93      19,205 (18,657)         no              87.3 (84.8)             KDE^b 
Chunk 2    108.9 - 131.0     35.6 - 56.2      143.66      11,337 (11,024)         no              78.9 (76.7)             KDE 
Chunk 3    115.7 - 132.8     28.8 - 44.4      107.34      9,476 (6,949)           b > 25°         88.3 (64.7)             ^d 
Chunk 4    128.7 - 195.0     -3.3 - 5.0       306.50      32,750 (20,679)         b > 25°         106.9^c (67.5)          ^d 
Chunk 5    185.0 - 232.2     26.2 - 40.7      245.82      18,533 (13,418)         no              75.4 (54.6)             ^d 
Chunk 6    225.4 - 244.9     13.5 - 30.53     186.13      19,304 (13,130)         no              103.7 (70.5)            ^d 
Chunk 7    194.0 - 237.9     -3.6 - 3.2       257.01      10,783 (9,596)          no              42.0 (37.3)             Likelihood 
Chunk 8    240.2 - 253.1     10.5 - 22.9      97.82       4,004 (3,500)           no              40.9 (35.8)             Likelihood 
Chunk 9    316.3 - 330.0     2.5 - 11.1       97.54       3,870 (3,360)           b < -25°        39.7 (34.4)             Likelihood 

Year One                                      1661.75     132,923 (100,313)                       80.0 (60.4) 

Chunk 10   245.0 - 258.6     17.1 - 30.0      91.14       3,661 (3,325)           no              40.2 (36.5)             Likelihood 
Chunk 11   317.0 - 45.0      |1.25|           (219.84)    8,820 (8,432)           no              40.1 (38.4)             variability^b 
Chunk 12   324.6 - 45.1      0.55 - 36.2      2075.9      84,038 (77,447)         no              40.5 (37.3)             Likelihood/XDQSO 
Chunk 13   317.0 - 45.0      -9.9 - -0.8      281.7       11,051 (10,072)         no              39.2 (35.8)             Likelihood/XDQSO 
Chunk 14   111.8 - 131.5     9.0 - 36.3       347.43      14,165 (13,479)         no              40.8 (38.8)             XDQSO 
Chunk 15   118.9 - 263.9     -0.8 - 68.7      5743.5      233,530 (220,029)       no              40.7 (38.3)             XDQSO 
Chunk 16   118.9 - 247.3     -0.8 - 35.6      (3108.3)    [128,250 (120,905)]     no              41.3 (38.9)             XDQSO 
Chunk 17   118.9 - 247.3     4.4 - 35.6       (2742.4)    [116,471 (107,562)]     no              42.5 (39.2)             XDQSO 
Chunk 18   226.9 - 263.9     23.1 - 41.1      (337.20)    [13,372 (12,699)]       no              39.7 (37.7)             XDQSO 

Year Two                                      8539.65     355,265 (332,784)                       41.6 (39.0) 

Total                                         10,201.4    488,188 (433,097)                       47.9 (42.5) 

a The RA and Dec ranges give the extremities of each chunk area, and thus do not indicate the coordinates of the corners of the chunk 
footprints. Chunks 16, 17 and 18 lie within the area of Chunk 15, hence their areas and targets are not counted towards the total. 
b From single-epoch data. 
c Chunk 4 uses imaging data in which problems with the u-band data lead to an excess target density (> 106 targets deg$^{-2}$). 
d A ranking scheme was used; for Chunks 3-6, CORE included a combination of NN, Likelihood, and KDE targets (§ 4.3). 

Likelihood targets were selected at a target density of
~35 targets deg^-2 using a threshold P = 0.10, Neural
Network targets at ~20 deg^-2 with a threshold y_NN =
0.65, and KDE targets using the coadded data at ~50
deg^-2. The density of the coadded KDE targets was
tuned on a second χ²_star value calculated from coadded
data, to obtain the required total of 80 targets deg^-2
across all methods. This second χ²_star parameter is
dependent on right ascension, but is always > 4.0.

The final Chunk 1 target densities were approximately
7, 2, 20, and 60 targets deg^-2 for the Known, KX-selected
(§ 3.5.1), CORE, and BONUS targets, respectively (with
overlap between these categories). At tiling, all quasar
targets were given priority over all other BOSS targets
(such as galaxies) for Chunks 1 and 2. The tiling priorities
for all the chunks are given in Appendix B (Table 12).

About 100 deg^2 of Chunk 1 was observed following
hardware commissioning, i.e., after MJD 55176. Stripe
82 was re-observed in Fall 2010 as part of Chunk 11
(§ 4.4).

4.2. Chunk 2 

For this chunk in the NGC (Figure 5), the targeting
algorithms were similar, but not identical, to those for
Chunk 1, as coadded photometry was not available. The surface
density of known quasars was lower than in Chunk 1
(since Stripe 82 contains more extensive spectroscopy
from prior surveys), and with no UKIDSS coverage, there
were no KX-selected targets. Unresolved optical objects
that had a match to any FIRST source (§ 3.6) were included
and given the target flag QSO_FIRST_BOSS (bit 18).
The CORE method remained the KDE. The Chunk 2
target densities were approximately 2, 2, and 20 deg^-2
for the Known, radio-selected and CORE objects, respectively.

Objects from the Likelihood, NN and KDE methods
were targeted for BONUS, using single-epoch data to
achieve ~35, 20 and 25 targets deg^-2, respectively. As
in Chunk 1, the KDE was tuned on the χ²_star parameter
to obtain a total of 80 targets deg^-2 over all methods.
In Chunk 2, flux errors are larger than in Chunk 1, due
to the use of single-epoch data. Thus, the stellar locus
is expanded and there is far less overlap between the
targets chosen by various methods. The target density
of QSO_BONUS sources is thus approximately halved in
Chunk 2.

As this chunk used single-epoch data with its larger
photometric errors, the thresholds for the target selection
algorithms were modified as follows, giving the target
densities above:

• The Likelihood probability threshold, P, was
changed from 0.10 to 0.24;

• The NN probability parameter, y_NN, was changed
from 0.65 to 0.70;

• The KDE algorithm was retrained, using all available
quasar spectroscopy to date.
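
As a rough illustration of how a probability threshold maps onto a target density, the sketch below ranks candidates by probability and reads off the cut that keeps a requested number per square degree. It is not the BOSS pipeline: the function name `threshold_for_density`, the uniform random probabilities, and the 10 deg^2 area are all assumptions made for the example.

```python
import random

def threshold_for_density(probs, area_deg2, target_density):
    """Return the probability cut that keeps about target_density objects per deg^2."""
    n_keep = int(round(target_density * area_deg2))
    if n_keep <= 0:
        raise ValueError("requested density selects no objects")
    ranked = sorted(probs, reverse=True)      # highest probability first
    n_keep = min(n_keep, len(ranked))         # cannot keep more than we have
    return ranked[n_keep - 1]                 # probability of the last object kept

# Synthetic candidates (hypothetical): 1000 objects over 10 deg^2.
random.seed(0)
probs = [random.random() for _ in range(1000)]

# Ask for 35 targets per deg^2, the Likelihood density quoted in the text.
cut = threshold_for_density(probs, area_deg2=10.0, target_density=35.0)
n_selected = sum(p >= cut for p in probs)
print(f"cut = {cut:.3f}, selected = {n_selected}, density = {n_selected / 10.0:.1f} per deg^2")
```

With noisier single-epoch photometry the probability distribution changes, so the same requested density generally corresponds to a different cut, which is the kind of retuning described in the list above.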

At this stage, the list of quasars with high-resolution
spectroscopy (Section 3.7) was added to the database of
known quasars, although few lie within the boundaries
of Chunk 2.

4.3. Chunks 3, 4, 5 and 6

We already had our initial spectroscopic results in hand
from ~20 plates from Chunks 1 and 2 when we identified
targets in Chunk 3, and we used these results to refine
our algorithms. In particular, we rejected FIRST sources
with (u - g) < 0.4, greatly decreasing contamination from
z < 2.2 quasars, but decreasing the number of FIRST
z > 2.2 objects by only 10%. The resulting FIRST target
density drops to ~1-2 deg^-2, 40% of which turn out to
be 2.2 < z < 3.5 quasars (see Fig. 3).

In the first two chunks, we found that only 1 new bright 
(i < 17.7) z > 2.2 quasar had been discovered from 486 
bright targets. Thus, a bright limit of i > 17.8 was set 
to reduce stellar contamination at the bright end. Due 
to the proximity of Chunk 3 to the Milky Way, we also 
imposed a Galactic latitude cut of b > 25°. 

There was a change in the target density and methodology
from those in Chunks 1 and 2. For Chunks 3, 4, 5
and 6, the ranking method described in Section 3.8.1 was
adopted, allowing us to combine Likelihood, KDE, and
NN for CORE at 20 targets deg^-2. All remaining targets,
to a total density of 60 deg^-2, were designated as
BONUS. To monitor the CORE and BONUS changes,
two new target flags, QSO_CORE_MAIN (flag bit 40)
and QSO_BONUS_MAIN (flag bit 41; see footnote 37),
were introduced.

The final target densities for Chunks 3, 4, 5 and 6
were 2 and 1 targets deg^-2 for Known quasar and FIRST
targets, respectively. The CORE target density was ≈
19, 19, 16, 17 deg^-2 in the four chunks respectively, and
the BONUS target density was roughly 40 deg^-2. To
provide a more uniform galaxy sample, galaxy targets
were given precedence over quasar targets in tiling (see
Appendix B and Table 12).



4.4. Chunks 7-11 

Chunks 7, 8 and 9 were the first chunks to be targeted
at the nominal survey target density of 40 quasar
targets deg^-2. The area covered by Chunk 9 in the SGC
was not in the original SDSS survey, and target selection
was done from the DR8 imaging (Aihara et al. 2011) in
a region of sky where there were no previously known
z > 2.2 quasars in our catalog (§ 3.7). This change led
to a lower efficiency (see Section 5.3).

For Chunks 7, 8 and 9, based on the tests described
in Section 5.5, we set the CORE method to Likelihood,
while BONUS targets were selected using the
NN-Combinator (§ 3.8.2). In addition, previously known
stars from SDSS or 2dF spectroscopy were now vetoed.
The NN photometric redshift threshold was relaxed
slightly, from z_p^NN > 2.1 to z_p^NN > 2.0.

Chunk 9 was the last chunk to be observed in Year
One, and thus the last data included in the spectroscopic
sample presented in this paper. Target selection
for Chunk 10 was performed in the first year of BOSS,
but the Chunk 10 plates were not observed until the second
year, after the Summer 2010 shutdown. Chunk 11,
the re-observation of Stripe 82, was also observed at the
start of the second year of BOSS observations. As described
in detail by Palanque-Delabrouille et al. (2010),
a variability-based quasar selection was performed for
Chunk 11. This led to a significantly higher high-z quasar
density than elsewhere in the survey, 24 z > 2.15 quasars
deg^-2 (Palanque-Delabrouille et al. 2010), as we describe
further in Section 5.

37 Now requiring Long64, or "LL" integer type.

4.5. Chunks 12 and 13 

The XDQSO method was introduced in Chunk 12 to
test its efficiency. There is substantial overlap between
the target lists of XDQSO and Likelihood, and including
the highest ranked 20 targets deg^-2 from each yielded a
total of 25 targets deg^-2. We thus defined CORE to be
the union of all these targets. The NN-Combinator was
retained as the method for BONUS.
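
The union construction can be sketched with a toy example: take the 20 highest-ranked objects per deg^2 from each method and merge the two sets. The scores below are invented so that the two lists share 15 objects in a 1 deg^2 patch, which reproduces the 25 targets deg^-2 quoted above; the helper `top_per_deg2` and both score dictionaries are hypothetical and not part of the BOSS code.

```python
def top_per_deg2(scores, area_deg2, density):
    """IDs of the highest-scoring objects, up to `density` objects per deg^2."""
    n = int(round(density * area_deg2))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n])

AREA = 1.0  # deg^2, toy patch

# Invented scores for 40 candidate IDs (0..39).
# "XDQSO" prefers IDs 0..19; "Likelihood" prefers IDs 5..24.
xdqso_scores = {i: 1.0 - 0.01 * i for i in range(40)}
like_scores = {i: -abs(i - 14.5) for i in range(40)}

xdqso_top = top_per_deg2(xdqso_scores, AREA, 20)
like_top = top_per_deg2(like_scores, AREA, 20)

core = xdqso_top | like_top          # union defines CORE in this sketch
overlap = xdqso_top & like_top       # objects chosen by both methods

print(len(xdqso_top), len(like_top), len(overlap), len(core))
```

The size of the union is 20 + 20 minus the overlap, so a total of 25 targets deg^-2 corresponds to roughly 15 targets deg^-2 in common between the two lists.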

4.6. Chunks 14-18 

The Chunk 12 and 13 spectroscopic results demonstrated
the superiority of XDQSO for the CORE algorithm.
Therefore, from Chunk 14 onwards, and for the rest
of BOSS, the "Extreme Deconvolution" algorithm
(XDQSO), and that alone, was set to be CORE. This
led to a gain of ~1 high-z quasar deg^-2 in the CORE.

Various further improvements were implemented in
BONUS starting with Chunk 14. For Chunk 14, a change
in fiber collision prioritization (see the Appendix) led to
a gain of ~1 quasar deg^-2. In Chunk 15 we began a
policy of re-observing previously known quasars in plate
overlap regions, leading to a spectroscopic signal-to-noise
ratio gain of ~15% per quasar. In Chunk 16, we incorporated
UKIDSS photometry into the training of XDQSO
as an input to the NN-Combinator. This led to a gain
of 2-3 high-z quasars deg^-2 where UKIDSS data were
available. Overlap between adjacent imaging scans allowed
improved photometry for objects observed more
than once (Sec. 2), leading to a gain of ~0.3-0.5
quasars deg^-2 in Chunk 16. In Chunk 17, an optical-only
trained version of XDQSO (essentially what
is used for CORE) was also used as an input to the
NN-Combinator used for BONUS, with a gain of ~0.5
quasars deg^-2.

BOSS spectroscopic plates are designed by giving priority
first to BOSS galaxy and quasar targets, followed by
objects in various ancillary programs (Section 2 of
Eisenstein et al. 2011). If additional fibers are available, we
assign them to previously known 1.8 < z < 2.15 quasars;
these are labeled as SUPPZ in Figure 2 and are flagged
as QSO_KNOWN_SUPPZ in Table 2. Reobserving these
objects allows a measurement of the spectral structure
from metal lines along the line of sight and spectral artifacts
that may contaminate Lyα structure measurements
(McDonald et al. 2006).



4.7. The Sky Distribution of BOSS Quasar Targets 

The sky distribution of the BOSS quasar targets is
shown in Figs. 6, 7, 8 and 9. In Figs. 6 and 7 we show the
surface densities of BOSS quasar targets for the NGC and
the SGC, respectively, as selected by the CORE method
(XDQSO) for DR9. In Figs. 8 and 9 we show the surface
densities of BOSS quasar targets for the NGC and the
SGC, respectively, as selected by the CORE (XDQSO),
BONUS (NN-Combinator) and FIRST methods, together
with all previously known z > 2.2 quasars.

The CORE sample is designed to produce a mean surface
density of 20 targets deg^-2, and although it is reasonably
uniform, the density of targets ranges from 10
to 30 targets deg^-2 over the footprint of the survey. The
largest variations are associated with Galactic structure,
with excesses visible at low Galactic latitudes and in the
Sagittarius stream. The BONUS sample adds enough
targets in each area of sky to give a much more uniform
40 targets deg^-2.

5. RESULTS 

In this section, we present the results of spectroscopy
carried out during Year One after the completion of hardware
commissioning, from MJD 55176 (2009 December
11) through MJD 55383 (2010 July 06). The distribution
of BOSS Year One quasars on the celestial sphere is
shown in Fig. 10.

5.1. Global Properties and Efficiencies

Table 5 summarizes the results from the first year of
BOSS quasar observations. BOSS quasar targets are
those which have one of the target bit flags listed in
Table 2 set. There were 54,909 spectra of objects targeted
as quasars, of which 52,238 were unique objects. These
were observed over a footprint of 878 deg^2, giving a
mean surface density of 63.8 targets deg^-2.

Of the 54,909 (52,238 unique) spectra, 35,305 (33,556)
had high-quality redshifts, as designated by the "zWarning"
flag of the spectroscopic pipeline (Adelman-McCarthy
et al. 2008; Aihara et al. 2011). From visual
inspection of the data, the zWarning flag is reliable at
the 90-95% level for the quasar target spectra; very few
of the objects flagged as having high-quality redshifts
(i.e., zWarning=0) are incorrect. We present the performance
of zWarning as a function of magnitude and
S/N in Appendix D; most objects with zWarning ≠ 0
are faint objects with low S/N spectra. Given the faint
magnitude limit of BOSS, it is not surprising that many
of the targets that are not quasars lack the clearly identified
spectral features required to assign a high-confidence
redshift. We will present a detailed examination of the
performance of the reduction pipeline, the zWarning flag
and the findings from the visual inspection of the data
when we publish the BOSS Quasar DR9 Catalog in a
separate paper.

Of the 33,556 unique objects with high-quality redshifts,
11,149 are stars, while 13,580 have z > 2.20.
The remaining 8,827 objects are mostly quasars at
z ~ 0.8 and ~ 1.6, and low-z compact galaxies; see
Fig. 11. Of the 13,580 high-redshift objects, 2,317 had
the QSO_KNOWN_MIDZ flag set; thus the first year of
BOSS observations resulted in the spectroscopic confirmation
of 11,263 new z > 2.2 quasars. A full breakdown
of the number of objects associated with each target
flag, the number of good (zWarning=0) redshifts and
the number of z > 2.2 quasars obtained is given in
Table 6.
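
The bookkeeping in the last few paragraphs can be checked with a few lines of arithmetic. The counts are those quoted in the text and in the TOTAL row of Table 5; the derived surface density and the Chunk 7 fraction are simple ratios computed here for illustration, and the variable names are ours.

```python
# Year One counts quoted in the text (unique objects).
unique_good_redshifts = 33_556   # zWarning = 0
stars = 11_149
high_z = 13_580                  # z > 2.20
known_midz = 2_317               # previously known (QSO_KNOWN_MIDZ)
observed_area_deg2 = 878.14      # TOTAL row of Table 5

# Objects that are neither stars nor z > 2.2 quasars.
other = unique_good_redshifts - stars - high_z

# Newly confirmed z > 2.2 quasars.
new_high_z = high_z - known_midz

# Mean surface density of z > 2.2 quasars with good redshifts.
high_z_density = high_z / observed_area_deg2

# Chunk 7 (Table 5): unique targets and unique z > 2.2 objects.
chunk7_fraction = 1_595 / 4_506

print(other, new_high_z, round(high_z_density, 1), round(chunk7_fraction, 2))
```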

Figure 11 shows the redshift distribution of BOSS
quasars from the first year, and compares it with that
from the SDSS DR7 quasar sample (Schneider et al.
2010) and the 2SLAQ survey (Croom et al. 2009). This
plot is very similar, but not identical, to that shown in
the SDSS-III overview paper of Eisenstein et al. (2011).



Chunk   Observed       Total            CORE^a          # high-quality^b   # high-quality   # CORE high-
        Area (deg^2)   spectra          spectra         (zWarning=0)       z > 2.20         quality z > 2.20
1       37.4           3811 (3174)      988 (849)       2313 (1909)        1211 (986)       411 (355)
2       117.6          9865 (9018)      2639 (2409)     7052 (6461)        2018 (1847)      880 (799)
3       33.1           2191 (2142)      630 (616)       1463 (1433)        521 (513)        268 (264)
4       168.3          11879 (11362)    3275 (3126)     6603 (6302)        2527 (2417)      1320 (1269)
5       186.0          10344 (10154)    2924 (2875)     7132 (7004)        3376 (3323)      1714 (1691)
6       121.7          8733 (8582)      2063 (2023)     5091 (5003)        1914 (1878)      915 (896)
7       120.8          4615 (4506)      2647 (2581)     3100 (3027)        1635 (1595)      1188 (1160)
8       67.0           2565 (2400)      1697 (1591)     1891 (1762)        834 (772)        657 (608)
9       26.2           906 (900)        742 (738)       660 (655)          251 (249)        226 (224)
TOTAL   878.14         54909 (52238)    17605 (16808)   35305 (33556)      14287 (13580)    7579 (7266)

TABLE 5

Summary of the results from the first year of BOSS quasar observations, chunk by chunk. Numbers in parentheses are for
unique objects.

^a CORE defined as target bit 10 for Chunks 1 and 2, bit 40 for Chunks 3-9 (Table 2).
^b High-quality redshifts are those for which the spectroscopic pipeline zWarning flag is zero.

Fig. 6. The quasar target density map in the NGC for the XDQSO CORE targets, displayed in equatorial coordinates. The units are
targets deg^-2.



TARGET_FLAG    No. of targets    No. of targets with    zWarning=0 and     zWarning=0 and
               (only)            zWarning=0 (only)      z > 2.20 (only)    stars (only)
CORE           3627 (1509)       2693 (890)             1291 (89)          1007 (619)
BONUS          4071 (2927)       2631 (1756)            546 (131)          1558 (1300)
KNOWN_MIDZ     2975 (529)        2831 (490)             2520 (357)         0 (0)
KNOWN_LOHIZ    0 (0)             0 (0)                  0 (0)              0 (0)
NN             17678 (1111)      13988 (791)            8197 (152)         3776 (562)
UKIDSS         139 (36)          119 (33)               80 (6)             22 (21)
KDE_COADD      2407 (860)        1517 (309)             890 (31)           324 (107)
LIKE           30534 (2541)      23022 (1779)           11793 (479)        4712 (794)
FIRST          986 (530)         791 (400)              403 (104)          35 (34)
KDE            27145 (0)         16068 (0)              7313 (0)           5330 (0)
CORE_MAIN      13978 (0)         10652 (0)              6288 (0)           2106 (0)
BONUS_MAIN     40363 (8)         25218 (2)              10616 (0)          7588 (2)

TABLE 6

The total number of spectra of objects selected with each target flag for Year One observations. Objects can be
counted more than once; the number of objects in only one category is also shown. Also tabulated are the number of
good (zWarning=0) redshifts, the number of z > 2.2 quasars, and the number of stellar spectra obtained.

Fig. 10. Sky distribution of the 14,287 quasars in the BOSS Year One quasar survey (J2000 equatorial coordinates), in red. The nine
chunks are labeled accordingly, and the dotted lines are drawn at Galactic latitudes b = ±25°. The spectroscopically confirmed SDSS-I/II
DR7 quasar catalog (Schneider et al. 2010) is shown for comparison in black.



Of course, the DR7 sample is selected over the full SDSS-II
imaging area, approximately 9,380 deg^2, while the
BOSS Year One data come from observations of 880
deg^2. Already BOSS has slightly more quasars in the
z = 2.2-2.8 range, while at higher redshifts the DR7
sample remains larger.

Degeneracies in the color-redshift relation of quasars
lead to the selection of low-z quasars in BOSS. The
quasars at z ~ 0.8 have Mg II λ2800 Å at the same wavelength
as Lyα at redshift z ~ 3.1, giving these objects
similar broad-band colors, while the large number of objects
at z ~ 1.6 is due to the confusion between C IV
λ1549 and Lyα at z ≈ 2.3. We shall come back to this feature
when comparing the performance of the NN, KDE,
and Likelihood methods in § 5.4. The tail of objects
at z > 3.5 includes a significant contribution from
re-observations of previously known quasars.
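
The two line coincidences can be verified with the redshift relation λ_obs = λ_rest (1 + z). The short sketch below computes, for an emission line at a given low redshift, the redshift at which Lyα would fall at the same observed wavelength. The rest wavelengths for Mg II and C IV are the rounded values used in the text, and 1215.67 Å is adopted for Lyα; the function names are ours.

```python
LYA = 1215.67    # Lyα rest wavelength in Angstrom
MGII = 2800.0    # Mg II, rounded as in the text
CIV = 1549.0     # C IV, rounded as in the text

def observed_wavelength(rest_wavelength, z):
    """Observed wavelength of a line emitted at redshift z."""
    return rest_wavelength * (1.0 + z)

def mimicked_lya_redshift(rest_wavelength, z):
    """Redshift at which Lyα lands at the same observed wavelength as the given line."""
    return observed_wavelength(rest_wavelength, z) / LYA - 1.0

z_from_mgii = mimicked_lya_redshift(MGII, 0.8)   # Mg II in a z ~ 0.8 quasar
z_from_civ = mimicked_lya_redshift(CIV, 1.6)     # C IV in a z ~ 1.6 quasar

print(f"Mg II at z = 0.8 mimics Lyα at z = {z_from_mgii:.2f}")
print(f"C IV  at z = 1.6 mimics Lyα at z = {z_from_civ:.2f}")
```

Both values land inside or near the 2.2 < z < 3.5 target range, which is why broad-band color selection admits quasars at these two lower redshifts.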
Figures 12 and 13 present our key results, the efficiency
of the current target selection algorithms. For these tests,
we have constructed a control sample of targets on Stripe
82, where our spectroscopy is more complete than anywhere
else on the sky, albeit still not perfect. Here we
include data from Year Two from Chunk 11, where Stripe
82 was retargeted using a variability selection for quasars
(Palanque-Delabrouille et al. 2010). Stripe 82 also has
high completeness because quasars are selected from coadded
photometry, with much smaller photometric errors.

Fig. 11. The redshift histogram of BOSS Year One quasars (solid red thick histogram). The dashed red line represents those objects
known prior to BOSS observations, while the distribution of newly confirmed quasars is given by the thin red line. For comparison the
SDSS DR7 quasars from Schneider et al. (2010) (selected over a much larger sky area) are shown by the black histogram, while the 2SLAQ
quasar data (Croom et al. 2009) are in blue.

Fig. 12. Cumulative number of quasars with z > 2.2 as a function
of the rank of the target for the Stripe 82 control sample with
single-epoch photometry. At 20 fibers deg^-2, the XDQSO CORE
algorithm selects 10.7 quasars deg^-2, while previously known and
FIRST sources add an average of 1.5 quasars deg^-2. At 40 fibers
deg^-2, the total surface density of z > 2.2 quasars selected by
our current algorithms from single-epoch SDSS photometry is 15.4
deg^-2. Note that these numbers represent an average over a wide
range of Galactic latitude, and therefore stellar contamination.

Fig. 13. Similar to Figure 12, but showing the impact of adding
GALEX photometry, UKIDSS photometry, or both to SDSS single-epoch
photometry. This figure is based on Stripe 82 data and
XDQSO selection for all targets.

Fig. 14. Completeness of BOSS single-epoch target selection
vs. redshift, on Stripe 82. The blue histogram shows the redshift
distribution of all spectroscopically confirmed quasars on Stripe 82.
The red histogram is for those quasars that pass the BOSS single-epoch
target selection for a threshold tuned to produce 40 targets
deg^-2. Purple points with Poisson error bars show the ratio of the
two, i.e., the selection completeness (right-hand scale).

For Figure 12 we select the quasar targets in our normal
way from single-epoch data, with the first 20 targets
deg^-2 selected by the XDQSO CORE algorithm.
Targets are ranked in order of probability, and the plot
shows the number of z > 2.2 quasars deg^-2 vs. the
number of targets deg^-2, with the slope of the curve
indicating the efficiency of selection. The CORE algorithm
selects 10.7 z > 2.2 quasars deg^-2 from its 20 targets.
We then show the average contribution of KNOWN
and FIRST quasars, totaling 1.6 high-z quasars deg^-2.
This increment assumes a surface density of 0.9 known
high-z quasars deg^-2 (and 0.7 deg^-2 from FIRST), which
is consistent with our Year One data (see Table 6) but
lower than the surface density of known pre-BOSS high-z
quasars on Stripe 82, which is unusually well studied.
Finally, we add the BONUS targets from the
NN-Combinator, again in rank order. At 40 targets deg^-2,
we are just above the minimum BOSS goal, with a mean
density of 15.4 z > 2.2 quasars deg^-2. Stripe 82 samples
a wide range of Galactic latitude and thus stellar density;
we therefore anticipate that this test should be representative
of selection efficiency averaged over the full BOSS
survey region. We also found from observations of early
chunks that adding fibers beyond the nominal
40 deg^-2 led to only minimal gains in yield.
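
A cumulative yield curve of the kind plotted in Figure 12 can be built from a rank-ordered list of targets and a flag marking which ones turned out to be z > 2.2 quasars. The sketch below uses invented flags over a 1 deg^2 patch (they are not Stripe 82 data) and then repeats the arithmetic on the densities quoted above; `cumulative_yield` is a hypothetical helper.

```python
from itertools import accumulate

def cumulative_yield(is_highz_ranked, area_deg2):
    """Return (targets per deg^2, z > 2.2 quasars per deg^2) along the ranked list."""
    targets = [(i + 1) / area_deg2 for i in range(len(is_highz_ranked))]
    quasars = [c / area_deg2 for c in accumulate(int(f) for f in is_highz_ranked)]
    return targets, quasars

# Toy ranked list: the first 20 targets succeed half the time,
# the next 20 only a quarter of the time (invented numbers).
flags = [i % 2 == 0 for i in range(20)] + [i % 4 == 0 for i in range(20)]
targets, quasars = cumulative_yield(flags, area_deg2=1.0)

# The slope between two points on the curve is the selection efficiency there.
eff_first_20 = quasars[19] / targets[19]
eff_last_20 = (quasars[39] - quasars[19]) / (targets[39] - targets[19])

# Arithmetic on the densities quoted in the text (quasars per deg^2).
core, known_plus_first, total_at_40 = 10.7, 1.6, 15.4
bonus_gain = round(total_at_40 - core - known_plus_first, 1)

print(eff_first_20, eff_last_20, bonus_gain)
```

On the quoted numbers, the BONUS targets contribute about 3.1 z > 2.2 quasars deg^-2, which illustrates the flattening of the curve at lower rank.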

Figure 13 shows the impact of adding UKIDSS and
GALEX data to single-epoch SDSS photometry. For this
test we use the XDQSO algorithm alone, since this is
where these auxiliary data sets currently enter our selection
procedures, and we extend the efficiency curves up
to 80 targets deg^-2. At 40 targets deg^-2, the efficiency
for XDQSO with single-epoch SDSS imaging alone is 15.0
z > 2.2 quasars deg^-2. Adding GALEX data improves
the efficiency to 16.2 deg^-2, adding UKIDSS improves
it to 17.3 deg^-2, and adding both improves it to 18.6
deg^-2. Thus, both of these data sets can significantly
enhance the efficiency of BOSS quasar target selection in
regions where they are available. Stripe 82 has medium-deep
("MIS") GALEX data, and the improvement with
shallower ("AIS") coverage will be smaller, but our tests
indicate that GALEX addition will still improve the
selection.

Fig. 15. Examples of spectra of BOSS quasar targets, plotted against
observed wavelength (Å). The
SDSS object name and pipeline redshift are given in each panel
(except for the star). From top to bottom: a z > 5 quasar found
by the Likelihood method; a newly discovered z = 2.6 quasar at
the typical S/N; a z = 3.5 quasar selected only by the KX method
(§ 3.5.1); a re-observed BAL quasar showing spectroscopic variability
(black line is the BOSS spectrum; red is the SDSS spectrum,
taken 3377 days earlier); a star with our typical S/N and a z = 1.5
quasar with our typical S/N. The feature at 5577 Å in all spectra
is a residual from a sky line.

Fig. 14 shows the redshift distribution of all known
quasars on Stripe 82, as well
as those selected by the single-epoch SDSS algorithms
illustrated in Fig. 12 above. The ratio of the two measures
the completeness of BOSS single-epoch quasar selection
relative to known quasars in this well studied region,
ranging from 40% to 70% over our critical redshift
range 2.2 < z < 3.5. Of course, this remains a lower
limit to the true completeness at the BOSS magnitude
limit, though in the 2.2 < z < 3.5 redshift range we anticipate
that the BOSS Stripe 82 sample selected from
co-added photometry and variability has high completeness
(Palanque-Delabrouille et al. 2010).
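
The completeness ratio and its error bars can be sketched as a per-bin division of two histograms. The counts below are made up for illustration and are not Stripe 82 measurements; the error model (Poisson uncertainty on the number of selected quasars only) is a simplifying assumption, since the text states only that Poisson error bars are shown.

```python
import math

def completeness(n_all, n_selected):
    """Per-bin (ratio, error) with a simple Poisson error on the numerator."""
    out = []
    for total, sel in zip(n_all, n_selected):
        if total == 0:
            out.append((None, None))   # empty bin: ratio undefined
            continue
        out.append((sel / total, math.sqrt(sel) / total))
    return out

# Invented counts per redshift bin: all known quasars vs. those selected.
n_all = [50, 100, 0]
n_selected = [20, 70, 0]

result = completeness(n_all, n_selected)
for (ratio, err) in result:
    print("undefined" if ratio is None else f"{ratio:.2f} +/- {err:.2f}")
```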

Fig. 15 shows examples of BOSS spectra of quasar targets
from the Year One data. From top to bottom:
a z > 5 quasar found by the Likelihood method (and
not selected by any other method); a newly discovered
z = 2.6 quasar at a typical S/N; a z = 3.5 quasar selected
only by the KX method; a re-observed BAL quasar
showing spectroscopic variability over 3377 days in the
observed frame; a star at our typical S/N; and a z = 1.5
quasar with our typical S/N.

5.2. Magnitude, Color and the L-z Plane

Fig. 16. Color-magnitude diagram ((u - g) vs. g) for objects
spectroscopically classified as stars (red contours and points) and
z > 2.2 quasars (blue contours and points). Only objects with
zWarning=0 are shown. The quasars are systematically bluer;
there are very few quasars with g < 18.

Fig. 16 shows the distribution of quasar targets from
the BOSS first-year data which are spectroscopically confirmed
as either stars or z > 2.2 quasars, in the (u - g) vs.
g color-magnitude plane. The distribution of stars at
the bright end, g < 18, and the lack of bright z > 2.2
quasars, led us to impose the bright i = 17.8 limit. Objects
fainter than g = 22 are brighter than our r-band
limit of 21.85 mag.

Fig. 17 shows the SDSS (u - g), (g - r), (r - i), and
(i - z) colors as a function of redshift for the BOSS Year
One data. Also shown are the mean color in redshift bins
(thin solid line), and the model of Bovy et al. (2011b,
in prep.; thick colored line). This model is systematically
bluer than the data at low redshift; BOSS target
selection systematically excludes UV-excess quasars, and
thus those low-redshift quasars that happen to enter the
sample are redder than the average quasar. The trends
with redshift are due to various emission lines moving in
and out of the SDSS broadband filters, and the onset of
the Lyα forest and Lyman-limit systems (e.g., Fan 1999;
Richards et al. 2002, 2003; Hennawi et al. 2010; Bovy
et al. 2011 and Peth et al. 2011, but see also Prochaska
et al. 2009 and Worseck & Prochaska 2011). McGreer et
al. (2011, in preparation) will present a detailed analysis
of this diagram, and its implications for our completeness.

Fig. 17. SDSS colors vs. redshift for quasars in the BOSS Year
One data. The thin solid line is the mean color in bins of redshift,
while the thick colored line is from the model of Bovy et al. (2011,
in preparation). The model is systematically bluer than the data
at low redshift because BOSS systematically excludes UV-excess
sources.

Fig. 18 shows the SDSS color-color diagrams for the
first year BOSS quasars, for all quasars with good
(zWarning=0) redshifts above z = 2.2. This figure illustrates
the redshift dependence of quasar colors as the
Lyα emission line moves from the g-band to the r-band
at z ≈ 3.5. Quasars with 2.2 < z < 3.5 lie in the range
-0.3 < (g - r) < 0.6, while objects with z > 3.5 generally
have (g - r) > 0.8.

Fig. 19 shows the distribution of objects in the redshift-luminosity
("L-z") plane for three recent large quasar
surveys: SDSS (black points), 2SLAQ (cyan) and BOSS
(red). There are ≈ 105,000 objects in the SDSS DR7 catalog,
and ~ 9,000 g < 21.85 low-redshift quasars from
the 2SLAQ Survey (Croom et al. 2009). We calculate
the absolute i-band magnitudes, M_i, using the observed
i-band PSF magnitudes and the K-corrections given in
Table 4 of Richards et al. (2006). The three surveys together
cover the L-z plane well, with a dynamic range
in luminosity of ~ 4 magnitudes at any given redshift up
to z ~ 3.5. This coverage will be vital for calculating the
evolution of the faint end of the quasar luminosity function,
and placing strong constraints on the luminosity
dependence of quasar clustering.
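
For readers who want to reproduce the conversion from apparent to absolute magnitude, a minimal sketch follows. It evaluates M_i = m_i - DM(z) - K(z), with the distance modulus DM computed by numerical integration. The Hubble constant of 70 km s^-1 Mpc^-1 is the value given in the caption of Fig. 19; the flat cosmology with Ω_m = 0.3 is an assumption made here, and the K-correction is left as a user-supplied placeholder (`k_corr`) because the tabulated values of Richards et al. (2006) are not reproduced in this section.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, n_steps=2000):
    """Luminosity distance in Mpc for a flat Lambda-CDM model (assumed cosmology)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))
    dz = z / n_steps
    # Trapezoidal integration of 1/E(z') from 0 to z.
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    for k in range(1, n_steps):
        integral += inv_e(k * dz)
    integral *= dz
    return (1.0 + z) * (C_KM_S / h0) * integral

def absolute_magnitude(m_app, z, k_corr=0.0, **cosmo):
    """M = m - 5 log10(d_L / 10 pc) - K; k_corr is a placeholder K-correction."""
    d_pc = luminosity_distance_mpc(z, **cosmo) * 1.0e6
    return m_app - 5.0 * math.log10(d_pc / 10.0) - k_corr

# Magnitude limits mentioned in the Fig. 19 caption, evaluated at z = 2.5
# with no K-correction applied (illustrative only).
m_faint = absolute_magnitude(22.0, 2.5)
m_bright = absolute_magnitude(18.0, 2.5)
print(f"M(i=22) = {m_faint:.2f}, M(i=18) = {m_bright:.2f}")
```

At fixed redshift the two magnitude limits differ by exactly 4 magnitudes in M_i, which is the dynamic range quoted in the text.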

5.3. Comments on Several Chunks

Fig. 18. Color-color diagrams for the First Year data for all spectroscopically confirmed quasars with good (zWarning=0) redshifts
above z = 2.2. The stellar locus is shown as contours. Top left, ugr; top right, gri; bottom left, riz. The horizontal swath of both stars and
quasars at g - r ~ 1.5 in the (u - g), (g - r) color-color diagram is caused by the large u-band photometric errors in the reddest objects. The
colors of points encode their redshifts; the sizes of the points vary for clarity. The lower right panel shows the i magnitude as a function of
the g - r color.

Because of the BOSS hardware commissioning in Fall
2009, only 37.4 deg^2 (out of a possible 220 deg^2) were
observed in Chunk 1 under survey-quality conditions after
MJD 55169. Thus Stripe 82 was re-targeted, re-tiled
and re-observed for Year Two as Chunk 11
(Palanque-Delabrouille et al. 2010). However, the non-survey
quality data from prior to MJD 55169 were visually inspected
during the very early part of the survey, and used to inform
subsequent QTS decisions.

In Chunks 1-6, the quasar target selection algorithm 
was generous, allocating 60-80 targets deg -2 . Chunk 7 
was the first time we ran the BOSS QTS at the nom- 
inal 40 targets deg -2 . Of the 4,506 unique targets in 
this chunk, 1,595 (35%) are classified as z > 2.20 ob- 
jects with zWarning=0 (Table [5]). Although this does 



not reach the BOSS efficiency goal of 50%, there are sev- 
eral reasons that this number can be considered a lower 
bound. First, Chunk 7 is in the region of sky known 
to have a high density of faint stellar sources, due to 
the presence of the tid al stream of the Sagittarius dwarf 
spheroidal g alaxy (se e |Ibata et al.|1995 1997 Belokurov 
et al.|[2006| and our Fig. [I]) . Second, visual inspections 
of the spectra identified OTo-l more high-z quasars per 
square degree than the pipeline, and while not all of these 
might be suitable for LyaF analyses (e.g., due to BALs 
which cause the pipeline to fail), there should be a net 
gain upon production of the final BOSS quasar catalogs. 
Finally, and potentially most importantly, we know that 
our target selection methods and algorithms have contin- 
ued to improve, with the incorporation of XDQSO and 
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Fig. 19. — The L — z plane for three recent quasar surveys: SDSS-I/II, (black points), 2SLAQ (cyan) and BOSS (red). The luminosity 
assumes H = 70 k ms" 1 Mpc" 1 . There are a 105,000 objects in the SDSS DR7 catalog and a 9,000 g < 21.85 low-redshift quasars from 
the 2SLAQ Survey i Croom et al,||2009} . The three surveys together give a dynamic range in luminosity of ss 4 magnitudes at any given 
redshift up to z ~ 3.5. The luminosity corresponding to magnitude limits of i = 22 on the faint end and i = 18 on the bright end are 
shown. The coverage here can be compared to Fig. 5 injCroton ( 2009} . 
Fig. 20. — The BOSS quasar redshift distribution for objects with
reliable redshifts (zWarning=0), selected by our three main methods
from Year One. The green, blue and black histograms give the
redshift distributions for the Likelihood, NN and KDE methods,
respectively. The red histogram is the full sample from Figure 11.
These methods were not applied uniformly through Year One, so
this plot is shown for qualitative and informative purposes only,
and should not be used as a direct comparison between the methods.
The KDE, NN and Likelihood algorithms are not mutually
exclusive, with many objects selected by more than one method.



ancillary data such as UKIDSS and GALEX (see also the 
discussion on a variability based QTS in § [7]). 

In this context, the performance of QTS in Chunks 8
and 9, with only 11.5 and 9.5 z > 2.2 quasars deg -2 ,
respectively, was disappointing. Chunk 8 lies at relatively
low Galactic latitudes, and is affected by stellar
contamination. Chunk 9 is in a region of sky where there were
neither previously known quasars nor FIRST radio
coverage. We continue to observe the rest of Chunks 8 and
9 in Year Two.



Selection     # Quasar          # with            and with         or are
              targets           zWarning=0        z > 2.20         stars

Totals        52,238            33,556            13,580           11,149
KDE           34,503 (4,794)    20,993 (2,693)     9,050 (229)      7,607 (1,856)
NN            16,747 (975)      13,267 (710)       7,743 (135)      3,604 (504)
Likelihood    29,150 (2,325)    21,975 (1,647)    11,244 (447)      4,483 (724)

TABLE 7

The number of unique quasar targets from the first year
of BOSS spectroscopy, broken down by the three key
selection methods. Numbers in parentheses indicate the
number of objects selected by the indicated method only.
Because these methods were applied non-uniformly, this
table is provided as an informational guide, and not as a
direct comparison between methods (see text for further
explanation).



5.4. Comparison of Algorithms 

The original motivation for the implementation of mul- 
tiple target selection algorithms was the lack of evidence 
prior to BOSS observations that a single method could 
select z > 2.2 quasars down to g ~ 22 with our required 
efficiency. With the Year One data now in hand, we can 
compare the effectiveness of our different methods. However,
due to the continually changing nature of the BOSS
QTS over this year, where different methods were used as
CORE and BONUS, these comparisons will be generally
qualitative in nature. The interested reader is referred
to the discussions in Bovy et al. (2011) for further
comparisons.

As an aid for our discussions, we give a condensed ver- 
sion of Table ffl in Table [7J where we list the number of 
targets from this first year, broken down by the three 
key selection methods. Again, given the non-uniform 
selection over this year, this table is provided as an in- 
formational guide only; it should not be used as a direct 
comparison between methods. 
The redshift distributions for objects with reliable
redshifts selected by our three main methods (NN, KDE
and Likelihood) are given in Fig. 20. Again, because of
the non-uniform manner in which these methods were
applied during Year One, this plot should not be
interpreted as a quantitative comparison between the
methods. There is substantial overlap between the methods;
many objects are selected by more than one technique.
The three histograms have similar shapes over the range 
2.2 < z < 3.5. While NN avoids being confused by 
z ~ 1.5 objects, and KDE avoids objects at z > 3.5, 
all three methods select a substantial number of objects 
at z ~ 0.8. 



Figs. 21, 22 and 23 show the color-color and the color-magnitude
distributions of z > 2.2 quasars selected by
the Likelihood, NN and KDE methods, respectively. The 
figures show in orange and black the ratio of numbers of 
objects selected by each method to the total number of 
Year One quasars, at each point in color space. This ratio 
is normalized to the global ratio of targets from Column
4 of Table 7; thus a point in color space with a value
> 100% is one where the method in question outperforms 
the total selection on average. The difference between the 
three methods is clear in the (u — g) vs. (g — r) diagrams. 
The contours for the Likelihood method are fairly flat 
away from the stellar locus. NN performs well at
(u - g) ~ 0.6, (g - r) ~ and in those regions of color-color
space corresponding to higher-redshift quasars, but does
more poorly elsewhere. KDE selects objects only over
a very narrow range in (g - r). From the (g - r) vs.
i-band color-magnitude diagram (bottom right panels of 
the figures), we see that the Likelihood method was more 
efficient at selecting fainter, i > 21.0 quasars, while the 
NN tends to select the brighter i < 20.0 objects at all 
(g — r) colors. 

These trends can be understood given the methodology
of these algorithms. The Likelihood method down-weights
objects close to the stellar locus as the denominator
of equation (5) gets large, which is why Likelihood
selects few objects there. Otherwise, the Likelihood
method traces the overall BOSS Year One sample
in color-color and color-magnitude space. The Likelihood
method did not place any cuts on photometric redshift,
and hence samples the high redshift distribution of the
BOSS data well, especially at (g - r) > 1 (corresponding
to redshift z > 3.5). We refer the interested reader to
Kirkpatrick et al. (2011) for full details of the Likelihood
performance.

At the crux of an artificial neural network is the sample
of objects used to train it (see Yeche et al. 2010 and
references therein, and Section 3.4). The training set for
the NN we have used was based on the SDSS quasar catalog
and the 2SLAQ surveys, and did not use data from
the MMT pilot survey (Appendix C) or the AUS survey.
Thus, this training set was geared towards brighter
quasars (i < 20.2), giving rise to the tendency for NN to
select the brighter quasars.

The KDE training set included only 2.2 < z < 3.5
quasars, and thus the redshift histogram drops to zero
at z = 3.5 (Fig. 20). This is related to the fact that
KDE quasars inhabit a much narrower range of the
(g - r) vs. (r - i) color-color plane than the other two
methods. In summary, Figures 21-23 reflect the relative
strengths and trainings of these methods; ultimately, the
three methods complemented each other well.

5.5. The Blind Test Area 

After spectroscopy from the first few chunks had been 
analyzed, it became clear that the survey would have to 
decide on a single method for the CORE, and that we 
would have to restrict ourselves to the nominal target 
density of 40 targets deg -2 . Thus, we designed a test 
to decide which combinations of methods gave the best 
yields for the CORE and BONUS selections. 

The "Blind Test Area" is a region of sky of ~1000
deg 2 in the NGC at high declination (δ > +40°) and high
Galactic latitudes, shown by the thin white line in Fig. 4.
This area is used for tuning the threshold of each method
to a particular target density. The resulting thresholds
were then applied to existing data to determine the
selection efficiency.
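
The tuning step described above can be sketched as follows. This is a minimal illustration rather than the survey code: the function name, the assumption that each method returns one scalar statistic per object (with larger values meaning "more quasar-like"), and the toy numbers are all hypothetical.

```python
import random


def threshold_for_density(stat, area_deg2, target_density):
    """Return the cut on `stat` that selects about `target_density`
    objects per deg^2 over an area of `area_deg2`.

    Assumes larger `stat` means more quasar-like (hypothetical convention).
    """
    ranked = sorted(stat, reverse=True)
    n_wanted = round(target_density * area_deg2)
    if n_wanted <= 0:
        raise ValueError("target density too low for this area")
    if n_wanted > len(ranked):
        raise ValueError("not enough candidates to reach the target density")
    # Keeping stat >= threshold retains the n_wanted highest-ranked objects
    # (more if several objects tie exactly at the threshold value).
    return ranked[n_wanted - 1]


# Toy example: 100,000 candidates over 1000 deg^2, tuned to 20 deg^-2.
random.seed(0)
toy_stat = [random.random() for _ in range(100_000)]
thr = threshold_for_density(toy_stat, 1000.0, 20.0)
n_selected = sum(s >= thr for s in toy_stat)
```

The threshold found in the tuning region would then be applied unchanged to objects with existing spectroscopy in order to count the recovered quasars.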

Table 8 summarizes these tests. This table gives the
surface density of 2.2 < z < 3.5 quasars from early
(Chunk 1, 2 and 3) BOSS spectroscopic data that would
be recovered by various methods at various thresholds of
their key parameters when they are tuned to yield a
surface density of 20 or 40 deg -2 in the blind survey region.
The effectiveness of each quasar spectrum for Lya forest
studies depends on its redshift (and thus the spectral
coverage of the forest) and its brightness (and thus the
S/N of the spectrum). This "value" is quantified by a
score for each quasar, motivated by the checks performed
by McDonald & Eisenstein (2007); summing this over the
expected quasars per square degree gives the numbers
in Table 8. These scores do not include contributions
from quasars outside the redshift range 2.2 < z < 3.5.
"Weighted Likelihood" was an adaptation of the Likelihood
method to maximize this score, as discussed in detail by
Kirkpatrick et al. (2011).

We also tried selecting quasars using a simple color
box isolating the region where z ~ 2.7 quasars are
found, akin to the mid-z box used by Richards et al.
(2002), but this did not deliver an efficiency close to our
requirements.

Although Table 8 shows that the KDE method returns
the most z > 2.2 quasars (9.45 deg -2 ) at the CORE
target density of 20 deg -2 , after much deliberation, we
selected the Likelihood method as CORE for the latter
stages of Year One, since it is a simpler algorithm
to understand and explain, it has a more uniform spatial
selection, and is easier to reproduce. Further tests
showed that using the Neural Network in its "Combinator"
mode for BONUS would yield the highest number
of high-z quasars overall. The difference when weighting
by the Lya forest score was too small to motivate
us to include it; see the discussion in McQuinn & White
(2011).

However, tests of the Year One data with the XDQSO
method (Bovy et al. 2011) showed it selected about 1
z > 2.2 quasar deg -2 more than Likelihood. Thus in
Chunks 12 and 13 (Section 4.5) the union of Likelihood
and XDQSO was treated as CORE, allowing us to test
them directly against one another (Bovy et al. 2011).
In Chunks 12 and 13, 2426 out of 4710 XDQSO targets
had spectra with zWarning=0 and 2.2 < z < 3.5, for an
efficiency of 52%, while Likelihood obtained 2296 quasars
from 5086 targets, for a 45% efficiency. This result is our
Method                 Threshold      Threshold      Nquasar (deg -2)   Nquasar (deg -2)   Score (deg -2)   Score (deg -2)
                       @ 20 deg -2    @ 40 deg -2    @ 20 deg -2        @ 40 deg -2        @ 20 deg -2      @ 40 deg -2

KDE                    0.904          0.599          9.45               11.35              4.79             5.71
Likelihood             0.543          0.234          8.70               12.23              4.39             5.89
Weighted Likelihood    0.262          0.108          8.89               12.33              4.58             5.98
NN                     0.852          0.563          7.62               10.84              4.00             5.51
NN Combinator          0.853          0.573          9.37               12.81              4.69             6.26
Color Box              n/a            n/a            6.45                                  3.41

TABLE 8

The surface density of spectroscopically confirmed 2.2 < z < 3.5 quasars from early (Chunk 1, 2 and 3) BOSS
spectroscopic data that would be recovered by various methods, and the thresholds of the key parameters (Table 3)
required to yield a surface density of 20 or 40 deg -2 in the blind survey region (§ 5.5). The Weighted Likelihood
incorporated a weighting function which optimizes the S/N of the Lya forest clustering signal. The redshift and flux
distribution of the resulting quasar sample determines this signal, as quantified by the score in the last two columns.







Fig. 21. — Distributions in color-color and color-magnitude space 
for z > 2.2 quasars selected by the Likelihood method in Year One. 
The black contours give the location of the stellar locus, while 
the orange contours give the ratio, at each point of color space, 
of z > 2.2 quasars selected by Likelihood to all Year One BOSS
quasars, normalized to the global ratio of the two. Quasar numbers 
were smoothed with a tophat of width 0.10 mag in u — g and g — r, 
and 0.05 mag in r — i and i — z, before taking ratios. 

motivation for declaring XDQSO to be CORE for the 
rest of the BOSS quasar survey. 
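
The efficiencies quoted for the Chunk 12 and 13 comparison follow directly from the target and quasar counts. The short sketch below reproduces that arithmetic; the binomial standard error is our own addition for illustration (no uncertainties are quoted in the text), and the function name is hypothetical.

```python
import math


def efficiency(n_quasars, n_targets):
    """Return (efficiency, binomial standard error) for a target list.

    The binomial error is an illustrative assumption, not a quoted result.
    """
    if n_targets <= 0:
        raise ValueError("n_targets must be positive")
    eff = n_quasars / n_targets
    err = math.sqrt(eff * (1.0 - eff) / n_targets)
    return eff, err


# Counts quoted for Chunks 12 and 13 (zWarning=0 and 2.2 < z < 3.5).
xdqso_eff, xdqso_err = efficiency(2426, 4710)
like_eff, like_err = efficiency(2296, 5086)
```

Under this simple error model the difference between the two methods is several times larger than either uncertainty.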

6. THE COMPLETENESS OF CORE IN YEAR ONE 

Studies of clustering in the Lya forest are not biased by
the distribution of background quasars used to illuminate
Lya forest absorption. Thus the Year One BOSS quasar
sample can be used for these studies. Indeed, Slosar et al.
(2011) have performed a first clustering analysis of Lya
forest flux from the BOSS Year One data.

However, given the changes in QTS throughout the
first year, the quasar sample described in this paper is far
from sufficiently uniform to be used directly for studies
of the statistics of the quasars themselves, such as
measurements of their luminosity function or clustering. The
goal of the CORE sample is to provide such a uniformly-
selected sample of quasars, but as the definition of CORE
changed several times during commissioning, CORE
objects in the first year do not represent a statistical sample.



Fig. 22. — As in Figure 21, for the NN method.
Fig. 23. — As in Figure 21, for the KDE method.
Chunk    Area       Effective area    Mean
         (deg 2)    (deg 2)           completeness

                    246.1             0.861
5        243.0      232.0             0.952
6        182.6      171.2             0.933
7        205.0      185.8             0.836
8         75.5       65.7             0.814
9         84.1       71.6             0.822
10        71.7       60.7             0.813

TABLE 9

Fraction C of objects that would have been targeted by
the a posteriori XDQSO CORE algorithm, which were
actually targeted, for each Year One chunk. Chunk 11
has greater area coverage than Chunk 1, thus we list it
instead. The second column gives the solid angle (in deg 2 )
of the region of each chunk in which the completeness is
greater than 0.75, the third column lists the same value
but for effective area (i.e., area × completeness) and the
fourth column tabulates the mean completeness over the
chunk. See also Fig. 24.

Chunk             Bits to Select

12, 13            40 AND 42
14 and onwards    40



The project settled on the XDQSO algorithm (§ 3.5;
Bovy et al. 2011) for the CORE method at the end of
Year One, and will use it for the rest of the survey. It is
therefore useful to apply this algorithm to the photometry
used in the Year One spectroscopy, and determine
the completeness of the Year One targeted chunks.
Table 9 and Fig. 24 give the results of this test. Given the
placement and overlap of the spectroscopic plates, each
chunk can be uniquely divided into sectors covered by a
unique combination of plates. The completeness of the
targeting, i.e., the fraction of the XDQSO CORE sources
that were actually targeted in Year One, is measured for
each sector separately. Encouragingly, these targeting
completeness values are generally 80% or higher, which 
indicates that statistical analyses of the final CORE sam- 
ple should be able to incorporate Year One data by in- 
troducing moderate weighting factors. The lower target- 
ing completeness (65%) on Chunk 11 highlights a sub- 
tle point: the completeness for CORE-selected quasars 
should be higher than the completeness for CORE tar- 
gets as a whole, because the true quasars are the most 
likely to also be selected by one of our other algorithms. 
In the case of Chunk 11, the deeper Stripe 82 photom- 
etry eliminates many noisy stellar contaminants in the 
single-epoch XDQSO target list, but it probably selects 
nearly all of the true quasars selected by CORE. 
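
The per-sector bookkeeping described in this paragraph can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys, the toy sector values, and in particular the use of 1/completeness as the "moderate weighting factor" are ours and are not specified in the text.

```python
def sector_completeness(sectors):
    """Targeting completeness, effective area and weight for each sector.

    Each sector is a dict with hypothetical keys:
      "area_deg2"  : solid angle of the sector
      "n_core"     : sources the a posteriori CORE selection would target
      "n_targeted" : how many of those were actually targeted
    """
    out = []
    for s in sectors:
        if s["n_core"] == 0:
            continue  # completeness is undefined without CORE sources
        c = s["n_targeted"] / s["n_core"]
        out.append({
            "completeness": c,
            "effective_area": s["area_deg2"] * c,
            # Assumed weighting scheme: up-weight targeted objects by 1/c.
            "weight": 1.0 / c if c > 0 else None,
        })
    return out


# Toy sectors (invented numbers, not survey data).
toy_sectors = [
    {"area_deg2": 10.0, "n_core": 200, "n_targeted": 180},
    {"area_deg2": 4.0, "n_core": 80, "n_targeted": 40},
]
result = sector_completeness(toy_sectors)
total_effective_area = sum(r["effective_area"] for r in result)
```

Summing the effective area over sectors above a chosen completeness level gives the kind of cumulative curve described for the top right panel of Fig. 24.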

For Year Two and the remainder of the BOSS quasar
survey, the CORE sample is defined by the boss_target1 flags
QSO_CORE_MAIN (bit 40) and QSO_CORE_ED for
Chunks 12 and 13, and QSO_CORE_MAIN (bit 40) only
for later chunks (Table 10).
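
A selection of this kind amounts to testing individual bits of the target flag. The sketch below is a hedged illustration: the function names are hypothetical, bit 42 for the second CORE flag is inferred from the "40 AND 42" entry of Table 10, and requiring both bits for Chunks 12 and 13 is our reading of that entry rather than a confirmed rule.

```python
QSO_CORE_MAIN_BIT = 40
QSO_CORE_ED_BIT = 42  # assumption: inferred from the "40 AND 42" table entry


def bit_is_set(value, bit):
    """True if bit number `bit` (0-indexed) is set in the integer `value`."""
    return bool((value >> bit) & 1)


def in_core_sample(boss_target1, chunk):
    """Apply the Year Two CORE rule to one object's target flag."""
    if chunk < 12:
        raise ValueError("this rule applies to Chunk 12 and later")
    main = bit_is_set(boss_target1, QSO_CORE_MAIN_BIT)
    if chunk in (12, 13):
        # Assumed reading of "40 AND 42": both bits must be set.
        return main and bit_is_set(boss_target1, QSO_CORE_ED_BIT)
    return main


both_bits = (1 << 40) | (1 << 42)
main_only = 1 << 40
```

Python integers are unbounded, so bit 40 is safe here; with fixed-width arrays a 64-bit integer type would be needed.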

For calculations of the quasar luminosity function, one
must also account for the incompleteness of the XDQSO
CORE sample relative to the full population of quasars.
This can be quantified, for example, using the extensive
targeting on Stripe 82 (Palanque-Delabrouille et al.
2010). Similarly, to determine completeness as a function
of position on the sky for quasar clustering work it
is necessary to determine the fraction of quasars hiding
among the unclassifiable spectra (see Appendix D).
Ongoing visual inspections of these spectra will address this
question to some extent.

TABLE 10

The BOSS_TARGET1 flag values that need to be set in
order to be able to select a CORE sample from Year Two
observations onward.

7. CONCLUSIONS AND FUTURE PROSPECTS 

This paper describes the BOSS quasar target selec- 
tion algorithms during the first two years of BOSS ob- 
servations. BOSS aims to obtain spectra of a sample of 
~150,000 z > 2.2 quasars, in order to probe structure in
the Lya forest to provide a percent-level measurement 
of the expansion history of the Universe, by measuring 
baryon oscillations in the Lya forest clustering. This 
first year was a commissioning period for quasar target 
selection, and the algorithms for identifying quasar can- 
didates varied significantly over the year. 

Our key results are: 

• We have performed quasar target selection (QTS) 
over 10,200 deg 2 of the SDSS-III imaging footprint, 
producing a list of 488,000 targets. These objects 
are selected to be at redshift z > 2.2, motivated 
by the need to observe the Lya forest in the BOSS 
wavelength coverage. 

• After a year of testing and evolution of the 
BOSS QTS, we settled on the Extreme Deconvo- 
lution method as our uniformly-selected subsample 
(CORE) and a neural network Combinator for the 
BONUS sample. 

• Having the BONUS selection allows us to imple- 
ment improvements throughout the survey, e.g., 
through auxiliary photometric data. This has al- 
ready been achieved with the inclusion of NIR 
YJHK photometry from the UKIDSS and UV 
data from GALEX, increasing our z > 2.2 quasar 
yields by ~ 2 — 3 deg -2 . 

• We obtained spectra of 54,909 objects selected by
the quasar target selection algorithms over a footprint
of 878 deg 2 during the first year of observations;
the mean target density is 63.8 targets deg -2 .

• Of these 54,909 spectra, 33,556 were unique objects
and had high quality spectra. 11,149 had redshifts
z < 0.02, and 13,580 had redshifts of z > 2.20 (of
which 11,263 were not previously known).

• Our mean z > 2.2 quasar surface density was 15.46
quasars deg -2 , with a global efficiency of
26.0%.

• The z > 2.2 objects selected by the three main 
methods used during Year One are found in dif- 
ferent regions in color-color and color-magnitude 
space, reflecting in part the fact that the meth- 
ods were trained for different redshift ranges. The 
three methods complemented each other well, and 
Fig. 24. — The fraction of the objects that would be targeted using the final version of the XDQSO CORE quasar target selection that were
actually targeted in Year One. Each panel shows the area covered by a Chunk (2-11) from Year One. We use Chunk 11 on Stripe 82 in
place of Chunk 1 (top-left panel) as Chunk 11 has superior areal coverage. Note that in some chunks, the scales on the RA and Dec axes
are quite different. Color coding shows the spectroscopic completeness of the a posteriori XDQSO CORE sample for each area. Those
areas in red have a targeting completeness above 0.75, orange have a completeness of 0.5-0.75, green have a completeness of 0.25-0.5 and
the few areas in blue have a completeness below 0.25. The top right panel shows the cumulative area (blue solid line) and effective area
(area × completeness; black dashed line) above a given level of targeting completeness for the XDQSO CORE sample.



together select 60-70% of all quasars in our magni- 
tude range with 2.2 < z < 3.5. 

• Working with single-epoch SDSS data, our current
target selection algorithms slightly exceed the
BOSS technical goal of selecting 15 z > 2.2 quasars
deg -2 from 40 targets deg -2 (Eisenstein et al.
2011). The tests on Stripe 82 indicate an efficiency
of 15.4 quasars deg -2 , of which 11.2 deg -2 come
from known quasars plus the CORE selection at
20 targets deg -2 (Fig. 12). We anticipate that
use of auxiliary imaging data, including GALEX,
UKIDSS, and additional SDSS epochs in overlap
regions, will boost our efficiency by 1-4 quasars
deg -2 , significantly increasing the statistical power
of BOSS Lya forest clustering measurements.

• All BOSS spectra from the first two years of
observations, August 2009 through to July 2011, will
be made publicly available in the next SDSS data
release, DR9.

We continue to investigate ways to improve quasar
target selection. We have already described the
incorporation of data from ultraviolet (GALEX) and near-IR
(UKIDSS). Data from the Wide-field Infrared Survey
Explorer (WISE; Wright et al. 2010) will provide
photometry at mid-infrared wavelengths for our targets; it is
deep enough to detect at least the brighter quasars in the
BOSS sample. Variability as measured from repeat scans
is an important method, independent of colors, to separate
quasars from stars. Building on the SDSS Stripe
82 study by Sesar et al. (2007), recent investigations
by Palanque-Delabrouille et al. (2010), Butler & Bloom
(2011), Richards et al. (2011), MacLeod et al.,
Kozlowski et al. (2011) and Sarajedini et al. (2011) have
re-invigorated the field of AGN identification through
variability selection. In addition to Stripe 82, roughly
50% of the SDSS imaging footprint has been imaged more
than once (Aihara et al. 2011), primarily in overlaps
between adjacent stripes. However, most of this area is
observed only a few times, over timescales of days, rather
than the desired month or year baselines that lead to
efficient AGN selection.

In this regard, the Palomar Transient Factory (PTF;
Law et al. 2009) 38 could be a natural dataset to use for
this purpose. The PTF is an automated, wide-field imaging
survey aimed at the exploration of the optical transient
sky. PTF uses the 1.2m Schmidt telescope at Palomar
Observatory with an 8 deg 2 field-of-view to perform
large area transient searches. An area of several hundred
deg 2 can be imaged in one night, typically in the Mould
R-band but also in the SDSS g-band. We are actively
investigating the inclusion of PTF imaging data into BOSS
QTS.

PTF could also potentially aid BOSS QTS by improving
star/galaxy separation at the faint end. Potentially
any of the PTF variability methods could work with
other transient/variability based surveys as well, e.g. the
Pan-STARRS survey (Kaiser et al. 2002).

38 http://www.astro.caltech.edu/ptf/
APPENDIX

APPENDIX A: QUASAR TARGETING LOGIC CUTS

This Appendix describes the various quality cuts that objects from the SDSS photometric pipeline must satisfy to
be considered for selection using the algorithms described in § 3. Target selection is restricted to sources that are
unresolved in SDSS imaging, as determined by the difference between the model and PSF magnitudes (Stoughton
et al. 2002); such objects are flagged with OBJC_TYPE = 6 in the outputs of the SDSS photometric pipeline (Lupton
et al. 2001).

To reduce processing time, we precalculate a number of combinations of flags from the photometric pipeline and
the photometric calibration (Padmanabhan et al. 2008). These flags are used in different ways for different target
selection algorithms, as summarized in Table 11; for example, we are not as stringent for objects selected as FIRST
radio sources (§ 3.6) as we are for those objects which are selected by their colors. In the main text, we refer to various
combinations of the six flag combinations described in this Appendix.



Is the Photometry Clean?

The photometric pipeline sets a series of flag bits for each detected object which identify problems with the processing
of the SDSS photometry, ranging from the presence of bad columns to issues with deblending (Stoughton et al. 2002).
These are particularly useful in recognizing when the photometry might be poor, and therefore color selection of targets
unreliable. The detailed meaning of the specific flag bits in what follows is described in Stoughton et al. (2002) and
on the SDSS-III web page 39 ; the logic behind these flag combinations is given in Richards et al. (2002).
Note that unlike the latter paper, we did not calculate and apply the flag checks on each band separately, and just
use the flags associated with the union of the detections in the five SDSS bands. While this could cause us to reject
some genuine quasars, checks on Stripe 82 (where the flag checking on the coadded data was significantly less strict;
see below) showed only a statistically insignificant 1% difference in the number of quasars identified.

We first define a combination of flag bits that denotes whether the source in question was adversely affected by
interpolation across bad pixels, bad columns, or bleed trails:

INTERP_PROBLEMS = (PSF_FLUX_INTERP && (gerr > 0.2 || rerr > 0.2 || ierr > 0.2)) || BAD_COUNTS_ERROR ||
(INTERP_CENTER && CR),

a combination that identifies objects in which the deblending of overlapping images may be questionable:

DEBLEND_PROBLEMS = PEAKCENTER || NOTCHECKED || (DEBLEND_NOPEAK && (gerr > 0.2 || rerr > 0.2 || ierr > 0.2)),

and a combination which identifies objects with detectable proper motion between the exposures in the different
SDSS filters (asteroids):

39 http://www.sdss3.org/dr8/algorithms/photo_flags.php
Flag Name              Bitmask   Description                                        CORE/BONUS   FIRST   KNOWN

GOOD                   11        Target has clean SDSS photometry                                X       X
GMAG_BITMASK           11        Target meets the magnitude limits                               X       X
GMAG_BITMASK_NOB       12        Used when no bright cut is required                X            ✓       X
RESOLVE_BITMASK        13        Target is a primary target in SDSS photometry                   ✓
BOUNDS_BITMASK         16        Target lies within the SDSS target footprint       X                    ✓
FIRST_COLOR_BITMASK    17        Color cut for objects that match a radio source    X            ✓       X

TABLE 11

Flags used by BOSS Quasar Target Selection.



MOVED = DEBLENDED_AS_MOVING && ((rowv/rowverr)^2 + (colv/colverr)^2 > 3^2).

Here, the symbols (&&, ||, !) have their standard meanings from Boolean logic. The quantities gerr, rerr, and ierr are
the quoted uncertainties in the PSF photometry in g, r, and i respectively, rowv and colv are the measured proper
motion along the rows and columns of the CCD, and rowverr and colverr are their errors.
A source is considered to have clean photometry if it satisfies the following:

GOOD = BINNED1 && !BRIGHT && !SATURATED && !EDGE && !BLENDED && !NODEBLEND && !NOPROFILE &&
!INTERP_PROBLEMS && !DEBLEND_PROBLEMS && !MOVED.
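
The flag logic above can be written out as a short function. This is a sketch of our reading of the definitions, not the pipeline code: representing the photometric flags as a set of names is a simplification (the real flags are bitmasks whose bit positions are not given here), and the function name and argument layout are hypothetical.

```python
def is_good(flags, gerr, rerr, ierr, rowv=0.0, rowverr=1.0, colv=0.0, colverr=1.0):
    """Evaluate the GOOD combination for one object.

    `flags` is a collection of flag names that are set for the object.
    Assumes rowverr and colverr are non-zero when the object is flagged
    DEBLENDED_AS_MOVING.
    """
    flags = set(flags)
    noisy = gerr > 0.2 or rerr > 0.2 or ierr > 0.2
    interp_problems = (
        ("PSF_FLUX_INTERP" in flags and noisy)
        or "BAD_COUNTS_ERROR" in flags
        or ("INTERP_CENTER" in flags and "CR" in flags)
    )
    deblend_problems = (
        "PEAKCENTER" in flags
        or "NOTCHECKED" in flags
        or ("DEBLEND_NOPEAK" in flags and noisy)
    )
    moved = (
        "DEBLENDED_AS_MOVING" in flags
        and (rowv / rowverr) ** 2 + (colv / colverr) ** 2 > 3 ** 2
    )
    rejected = {"BRIGHT", "SATURATED", "EDGE", "BLENDED", "NODEBLEND", "NOPROFILE"}
    return (
        "BINNED1" in flags
        and flags.isdisjoint(rejected)
        and not interp_problems
        and not deblend_problems
        and not moved
    )
```

For example, an object with only BINNED1 set and small photometric errors passes, while the same object with SATURATED set, or with PSF_FLUX_INTERP set and gerr > 0.2, does not.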



Magnitude Limits 

The GMAG_BITMASK records whether a target satisfies the magnitude limits required to be targeted as a quasar.
Magnitude cuts are made on PSF magnitudes measured by the SDSS, corrected for Schlegel et al. (1998) Galactic
extinction. These limits are encoded as:

GMAG_BITMASK = (g < 22 || r < 21.85) && i > 17.8.

This includes a cut at the bright end, reflecting the fact that bright z > 2.2 quasars are extraordinarily rare. We 
also define a variant of this flag: 

GMAG_BITMASK_NOB = (g < 22 || r < 21.85), 

to be used when no bright cut is required — such as when retargeting known quasars or FIRST objects. 
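
As a small illustration, the two magnitude conditions can be combined in one function. The function name and the `bright_cut` switch are hypothetical; the inputs are assumed to be extinction-corrected PSF magnitudes, as stated above.

```python
def passes_magnitude_limits(g, r, i, bright_cut=True):
    """Evaluate GMAG_BITMASK (bright_cut=True) or GMAG_BITMASK_NOB (False).

    g, r, i are assumed to be extinction-corrected PSF magnitudes.
    """
    faint_ok = g < 22.0 or r < 21.85
    if not bright_cut:
        return faint_ok
    return faint_ok and i > 17.8


# A bright known quasar fails the standard cut but passes the NOB variant.
standard = passes_magnitude_limits(17.9, 17.7, 17.5)
no_bright = passes_magnitude_limits(17.9, 17.7, 17.5, bright_cut=False)
```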

Resolving Image Overlaps 

The DR8 paper (Aihara et al. 2011) describes the algorithm used to define the primary detection of a given object,
if it lies in the ~50% of the SDSS footprint covered by more than one scan. The RESOLVE_BITMASK records whether
a source is a primary target in the SDSS photometry.
Boundary Logic 

The BOUNDS_BITMASK records whether a source is within the footprint of the SDSS imaging, which is useful for 
keeping track of data from the ancillary surveys (FIRST, UKIDSS, GALEX) used in the target selection. 

FIRST Color Logic 



We saw in § 3.6 that we could limit the number of z < 2.2 sources targeted by FIRST with a u - g color cut. Thus
we define:

FIRST_COLOR_BITMASK = (u-g> 0.4). 



Conditions for Generation of Stripe 82 Coadded Photometry 

The single-epoch photometry used for coaddition on Stripe 82 is first vetted by a series of quality cuts. All fluxes 
used in the coaddition are limited by the following conditions: 

• They must be primary, i.e., RESOLVE_BITMASK must be true; 

• They must be observed under photometric conditions (an important issue for Stripe 82, as it was repeatedly
observed under non-photometric conditions as part of the SDSS Supernova Survey; see Frieman et al. 2008);

• They must have a positive estimated inverse flux variance (zero values are indicative of problems with the data);

• They must pass various flag cuts:

(!DEBLEND_TOO_MANY_PEAKS && !SATUR && !BADSKY && !SATUR_CENTER && !INTERP_CENTER &&
!DEBLEND_NOPEAK && !PSF_FLUX_INTERP).
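
The vetting conditions listed above can be sketched as a filter applied before combining epochs. The filter follows the four listed conditions; the inverse-variance-weighted mean used to combine the surviving fluxes is an assumption on our part (the text does not state how the coaddition itself is done), and the dictionary keys are hypothetical.

```python
REJECT_FLAGS = frozenset({
    "DEBLEND_TOO_MANY_PEAKS", "SATUR", "BADSKY", "SATUR_CENTER",
    "INTERP_CENTER", "DEBLEND_NOPEAK", "PSF_FLUX_INTERP",
})


def coadd_flux(epochs):
    """Combine single-epoch fluxes that pass the quality cuts.

    Each epoch is a dict with hypothetical keys: "flux", "ivar" (inverse
    flux variance), "primary", "photometric" (booleans) and "flags" (names).
    Returns (flux, ivar) of the combination, or None if no epoch survives.
    """
    good = [
        e for e in epochs
        if e["primary"]
        and e["photometric"]
        and e["ivar"] > 0
        and REJECT_FLAGS.isdisjoint(e["flags"])
    ]
    if not good:
        return None
    # Assumed combination: inverse-variance-weighted mean of the fluxes.
    ivar_sum = sum(e["ivar"] for e in good)
    flux = sum(e["flux"] * e["ivar"] for e in good) / ivar_sum
    return flux, ivar_sum


# Toy epochs (invented values): the last two are rejected by the cuts.
toy_epochs = [
    {"flux": 10.0, "ivar": 1.0, "primary": True, "photometric": True, "flags": set()},
    {"flux": 20.0, "ivar": 3.0, "primary": True, "photometric": True, "flags": set()},
    {"flux": 1000.0, "ivar": 0.0, "primary": True, "photometric": True, "flags": set()},
    {"flux": 500.0, "ivar": 1.0, "primary": True, "photometric": True, "flags": {"SATUR"}},
]
combined = coadd_flux(toy_epochs)
```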
Area Name        Target Selection Version label    Tiling Priority

Chunk 1          comm                              all quasar targets before galaxy targets
Chunk 2          comm2                             "
Chunks 3, 4      main002                           all galaxy targets before quasar targets
Chunks 5, 6      main005                           "
Chunks 7, 8      main006                           "
Chunk 9          main006-masksgc 1                 "
Chunk 10         main006-collate-maskngc           KNOWN before galaxies; galaxies before CORE, BONUS, FIRST.
Chunk 11         vcat-2010-07-02                   "
Chunks 12, 13    main008-sgc40                     "
Chunk 14         main008-edcore-maskngc40          KNOWN, CORE, FIRST over galaxies; galaxies before BONUS.
Chunk 15         main008-edfinal-maskngc40         "
Chunk 16         main010-maskngc40                 "
Chunk 17         main011-maskngc40                 "
Chunk 18         main012-nosuppz-maskngc40         "

TABLE 12

This table lists the internal label of the version of target selection code used in each chunk, and also explains the
relative priority of different classes of target in the case of fiber collisions.



APPENDIX B: FLOWCHART FOR YEAR ONE QTS AND TARGET SELECTION VERSIONS 

Figure 2 is a flowchart which describes quasar target selection as it was carried out in Year Two and beyond. Fig. 25
gives the equivalent for Year One. The red numbers give the bitwise value for the boss_target1 flag. Those values
with asterisks have target flags that were obsolete after the first year of target selection.

Table 12 gives the BOSS quasar target selection version code label for each chunk. Sheldon et al. (2011, in prep.)
will describe in detail the differences between these versions.

Because of the 62" diameter of the cladding around each optical fiber, two objects with separation smaller than that
angle cannot both be observed on a given spectroscopic plate, which means that an algorithm to decide which of two
objects in such a collision should take precedence is needed. Our thinking on this evolved throughout Year One; the
rules for each chunk are given in Table 12. By Chunk 14, we settled on giving KNOWN, CORE, and FIRST quasar
targets higher priority than galaxy targets, with BONUS at lower priority.
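
A fiber-collision check of this kind can be sketched as below. This is an illustration only: the function names, the dictionary layout and the tie-breaking rule are hypothetical, the priority ranks encode the Chunk 14 rules quoted above, and the actual tiling algorithm is not described in this Appendix.

```python
import math

FIBER_COLLISION_ARCSEC = 62.0

# Lower rank means higher priority (rules quoted for Chunk 14 onward).
PRIORITY_RANK = {"KNOWN": 0, "CORE": 0, "FIRST": 0, "GALAXY": 1, "BONUS": 2}


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation from the haversine formula; inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0


def resolve_collision(a, b):
    """Return the target to keep if a and b collide, or None if both fit."""
    sep = separation_arcsec(a["ra"], a["dec"], b["ra"], b["dec"])
    if sep >= FIBER_COLLISION_ARCSEC:
        return None
    # Ties go to the first argument; the real tie-breaking is not given here.
    return a if PRIORITY_RANK[a["kind"]] <= PRIORITY_RANK[b["kind"]] else b


# Toy targets (invented positions): the first two are 30" apart.
bonus = {"ra": 150.0, "dec": 2.0, "kind": "BONUS"}
galaxy = {"ra": 150.0, "dec": 2.0 + 30.0 / 3600.0, "kind": "GALAXY"}
far_core = {"ra": 150.0, "dec": 2.0 + 100.0 / 3600.0, "kind": "CORE"}
```

In this toy case the galaxy wins over the BONUS target at 30" separation, while the CORE target 100" away does not collide at all.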

APPENDIX C: QUASARS FROM THE MMT PILOT PROGRAM 

Prior to the commencement of BOSS spectroscopy, we carried out spectroscopy of quasar candidates selected from 
coadded photometry in SDSS Stripe 82 to increase the number of faint quasars available in the BOSS redshift range 
for testing and training of BOSS targeting algorithms. Candidate quasars for these observations were select ed in two 



Hennawi 



ways: first, using very inclusive cuts m the (Xp hot , Xstar) plane, where these x 2 statistics are as defined in 
et al. (2010), and second, using the methods outlined in |Richards et al" ( 2009a|b ). These observations were intended 
to include as large a sample of z > 2.2 quasars as possible, but do not represent a statistically well-defined sample, so 
we do not describe their selection in greater detail. 

Observations of these candidates were carried out in queue mode between 2008 September and 2009 January using 
the Hectospec multi-fiber spectrograph (Fabricant et al. 2005) on the 6.5m Multiple Mirror Telescope (MMT). The 
data were reduced using Juan Cabanela's ESPECROAD pipeline, an external version of the SAO SPECROAD 
pipeline (Mink et al. 2007). Quasars were identified by eye, and redshifts were measured using IRAF. 

The MMT program was conducted before the release of the SDSS DR7 quasar catalog. In addition, BOSS targets all 
confirmed quasars from the MMT program for re-observation (§ 3.7). Thus, most of the MMT observations have been 
superseded by subsequent SDSS DR7 or BOSS spectroscopy at better resolution, wavelength coverage and signal-to-noise 
ratio. In Tables 13-14 we provide positions, PSF photometry (as observed, uncorrected for Galactic extinction), 
and redshifts for confirmed quasars from the MMT survey. Objects that are not flagged Primary in the CAS are listed 
separately. Over 99% of quasars that were observed a second time have redshifts in agreement (to Δz < 0.05) between 
the MMT survey and the SDSS/BOSS pipelines. 
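
The agreement criterion amounts to a simple tolerance check on repeat redshifts. The sketch below is illustrative only: the function name is hypothetical and the redshift pairs are made up, not values from the MMT or BOSS catalogs.

```python
def agreement_fraction(pairs, tol=0.05):
    """Fraction of (z_mmt, z_pipeline) pairs with |z_mmt - z_pipeline| < tol."""
    if not pairs:
        raise ValueError("need at least one redshift pair")
    n_agree = sum(1 for z_a, z_b in pairs if abs(z_a - z_b) < tol)
    return n_agree / len(pairs)

# Made-up repeat measurements; the last pair is a deliberate mismatch.
pairs = [(2.31, 2.32), (2.75, 2.74), (0.80, 0.80), (3.10, 2.45)]
frac = agreement_fraction(pairs)
```

Here three of the four made-up pairs agree within the tolerance, so `frac` is 0.75.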



APPENDIX D: PERFORMANCE OF ZWARNING 

Here we present the fraction of spectroscopically observed quasar targets which are flagged with zWarning ≠ 0 by the 
spectroscopic pipeline. As described in Adelman-McCarthy et al. (2008), this is an indication that the automatically 
derived redshift and classification are not reliable. 

Table 15 gives the three most common of the zWarning flag bits for quasar targets, a short description of each, and 
the number of objects with these bits set. 1851 objects have both bits 2 and 6 set. All other zWarning bits are set in 
200 or fewer objects, representing less than 1% of the sample. 
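
As a minimal sketch of how these flag bits can be read from a zWarning value: bit n contributes 2^n, so SMALL_DELTA_CHI2 (bit 2) alone gives zWarning=4, and an object with both bits 2 and 6 set has zWarning=68. The dictionary and function names are hypothetical and not part of the spectroscopic pipeline; only the bit numbers are taken from Table 15.

```python
# Bit numbers as listed in Table 15.
ZWARNING_BITS = {"SMALL_DELTA_CHI2": 2, "NEGATIVE_EMISSION": 6}

def flags_set(zwarning):
    """Names of the Table 15 flag bits that are set in a zWarning value."""
    return [name for name, bit in ZWARNING_BITS.items() if zwarning & (1 << bit)]

def is_good(zwarning):
    """A spectrum counts as good only when no zWarning bit is set."""
    return zwarning == 0

both = flags_set((1 << 2) | (1 << 6))  # value 68: bits 2 and 6 together
```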

Fig. 26 gives the fraction of objects with good (zWarning=0) spectra as a function of i-band magnitude and 
spectroscopic S/N per pixel (median over the spectrum). The (black) histogram shows the distribution of all objects to give 
a sense of where most of the signal arises. The most common flag is SMALL_DELTA_CHI2, indicating that 
[Figure 25: flowchart. Node labels, with boss_target1 bit values where legible: INPUT SDSS PHOTOMETRY; 
QSO_KNOWN_MIDZ (12); QSO_KNOWN_LOWZ (13, never target); MAGNITUDE CUTS (g_PSF < 22.0 OR r_PSF < 21.85) and 
i_PSF > 17.8; (u-g) > 0.4; QSO_FIRST (18); QSO_UKIDSS (15*); NN (y_NN and z_p,NN); Likelihood (P); QSO_CORE (10*); 
QSO_KDE (19); QSO_NN (14); QSO_LIKE (17); QSO_BONUS; NN-Combinator (NN_VALUE); QSO_KDE_COADD (16*); 
QSO_BONUS_MAIN (41); QSO_CORE_MAIN (40; chunks 7-10).] 


Fig. 25. - Schematic flowchart for the BOSS quasar target selection during the first year of observations, to be compared with the Year 
Two version in Figure 2. The red numbers give the bitwise value for the boss_target1 flag (see Table 2). The red numbers with asterisks 
have target flags that were obsolete after the first year of target selection. The input SDSS photometry is described in Section 2 and the 
algorithm to resolve overlapping images is explained in Aihara et al. (2011). Previously known objects are described in Section 3.7 and 
the FIRST radio selection is given in Section 3.6. The photometry flags are discussed in Section 2.2 and in Appendix A. The three target 
selection methods (KDE, NN, and Likelihood) are described in Richards et al. (2009a, and references therein), Yeche et al. (2010), and 
Kirkpatrick et al. (2011), respectively, and are outlined in Section 3. Each of these methods produces one or more continuous parameters 
[...] KDE method; y_NN and z_p,NN for the first Neural Network and NN_VALUE for Neural Network Combinator. The target selection flag bits are 
also shown, with descriptions in Table 2. Objects with i < 17.8 with FIRST counterparts are selected for spectroscopy. 
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23 
...          ... 15 44.9  21.865  ...    21.065  0.031  20.686  0.038  20.524  0.038  20.174  0.089  2.477
00 47 55.49  +00 14 42.3  23.320  0.546  21.762  0.053  21.431  0.056  21.214  0.064  20.483  0.138  0.822



TABLE 13 

Quasars discovered in the MMT survey. Many of these objects were subsequently confirmed in the SDSS DR7 quasar 
catalog or in BOSS. Imaging information is taken from the SDSS DR8 Catalog Archive Server. The first 10 objects 
are given to show the format of the table. This table is available in its entirety in machine-readable and Virtual 
Observatory (VO) forms in the online journal. 



RA           DEC          u       u_err  g       g_err  r       r_err  i       i_err  z       z_err  redshift
00 46 39.91  -00 05 03.7  21.784  0.184  21.572  0.075  21.498  0.096  21.396  0.114  20.750  0.264  2.235
00 57 16.14  +00 21 04.7  [... 21.755 ...]                                            22.929  0.861  2.402
03 37 10.37  +00 23 55.1  20.082  0.074  19.279  0.123  18.978  0.150  18.922  0.166  18.938  0.129  2.920
03 37 33.89  -00 03 04.7  21.458  0.120  20.264  0.259  19.667  0.021  19.282  0.024  19.127  0.050  0.671
22 58 58.68  -00 20 38.0  21.924  0.317  21.373  0.078  21.077  0.085  20.967  0.105  20.497  0.294  2.421
23 07 33.34  -00 17 58.9  21.981  0.186  21.884  0.088  22.491  0.209  21.815  0.163  21.323  0.334  2.765



TABLE 14 

Quasars discovered in the MMT survey that are non-primary in SDSS DR8 imaging 



zWarning flag      bit  Description                                                                 No. of objects in Year One (unique)
No flag set        ...  Spectrum has no known problems.                                             35,305 (33,556)
SMALL_DELTA_CHI2   2    χ² of best fit is too close to that of second best (< 0.01 in reduced χ²).  16,765 (15,982)
NEGATIVE_EMISSION  6    A quasar line exhibits negative emission.                                   620 (597)



TABLE 15 

zWarning flag bits and Year One Quasar Spectroscopy 



there is more than one template that fits the spectrum. This is most commonly seen in low S/N spectra. We hope 
that planned visual inspections of those objects with zWarning ≠ 0 will allow positive identification of many of these 
objects, boosting the number of confirmed high-redshift quasars. 
Fig. 26. - The fraction of Year One quasar targets with good redshifts (zWarning=0) as a function of i-band magnitude (left) and median 
spectroscopic S/N (right). Objects with zWarning=0 are given by the dotted lines, objects with zWarning=4 are given by the dashed lines, 
and objects with zWarning=0 || zWarning=4, representing 95% of our sample, are given by the solid lines (i.e. dashed+dotted = solid). 
Also shown separately are objects classified spectroscopically as stars (red) and high (blue) and low (green)-redshift quasars, as indicated. 
Their sum is given by the black lines. Also shown as the histogram and the right-hand y-axis is the distribution function of objects. 